[R] how to separate string from numbers in a large txt file

Ivan Krylov kry|ov@r00t @end|ng |rom gm@||@com
Fri May 17 21:43:46 CEST 2019


On Fri, 17 May 2019 11:36:22 -0700
Michael Boulineau <michael.p.boulineau using gmail.com> wrote:

> So, who knows what happened with the ï»¿ at the beginning of [1]
> directly above.

 perl -Mutf8 -MEncode=encode,decode -Mcharnames=:full \
 -E'say charnames::viacode ord decode utf8 => encode latin1 => "ï»¿"'
# ZERO WIDTH NO-BREAK SPACE
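
The same check can be sketched in R, in case Perl is not at hand. This is only an illustration and assumes the three stray characters are U+00EF, U+00BB and U+00BF (the variable names are made up for the example): encoded as Latin-1 they become the bytes EF BB BF, and those bytes read as UTF-8 are a single code point.

```r
# Bytes produced by encoding the three stray characters as Latin-1.
bytes <- as.raw(c(0xEF, 0xBB, 0xBF))

# Interpret those bytes as UTF-8 and take the resulting code point.
cp <- utf8ToInt(rawToChar(bytes))

# Format the code point in the usual U+XXXX notation.
label <- sprintf("U+%04X", cp)
print(label)
```

U+FEFF is ZERO WIDTH NO-BREAK SPACE, which is also used as the byte order mark at the start of some UTF-8 files.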

So the text seems to have been encoded in UTF-8, then decoded as
Latin-1. If you have multiple such artefacts and want to get rid of
them, try:

con <- file("hangouts-conversation-6.csv.txt", encoding = "UTF-8")
a <- readLines(con)
close(con); rm(con)
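
If the only artefact is the byte order mark at the very start of the file, R's file() also accepts encoding = "UTF-8-BOM", which decodes as UTF-8 and drops a leading BOM when one is present. A minimal sketch, using a temporary file with made-up contents in place of the real export:

```r
# Create a stand-in file: a UTF-8 BOM (EF BB BF) followed by one line of text.
# The contents are invented for the example, not taken from the real file.
tmp <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("sample line\n")), tmp)

# "UTF-8-BOM" reads the file as UTF-8 and removes the BOM if there is one.
con <- file(tmp, encoding = "UTF-8-BOM")
a <- readLines(con)
close(con); rm(con)

print(a)

# Clean up the temporary file.
unlink(tmp)
```

Whether this is needed depends on how the file was exported; a file without a BOM should read the same way with either encoding name.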

-- 
Best regards,
Ivan


