[R] how to separate string from numbers in a large txt file
kry|ov@r00t @end|ng |rom gm@||@com
Fri May 17 21:43:46 CEST 2019
On Fri, 17 May 2019 11:36:22 -0700
Michael Boulineau <michael.p.boulineau using gmail.com> wrote:
> So, who knows what happened with the ï»¿ at the beginning of 
> directly above.
perl -Mutf8 -MEncode=encode,decode -Mcharnames=:full \
-E'say charnames::viacode ord decode utf8 => encode latin1 => "ï»¿"'
# ZERO WIDTH NO-BREAK SPACE
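
For those without Perl at hand, roughly the same round trip can be sketched in base R. This is only an illustration, not part of the original one-liner; the variable names (bom, bytes, mojibake, recovered) are made up for the example.

```r
# U+FEFF is ZERO WIDTH NO-BREAK SPACE, also used as the byte order mark (BOM)
bom <- intToUtf8(0xFEFF)

# Encoded as UTF-8 it becomes the three bytes EF BB BF
bytes <- charToRaw(bom)

# Forward direction: misreading those bytes as Latin-1 gives three
# separate characters, U+00EF U+00BB U+00BF
mojibake <- iconv(rawToChar(bytes), from = "latin1", to = "UTF-8")

# Reverse direction (what the Perl one-liner does): encode the three
# characters as Latin-1, then interpret the resulting bytes as UTF-8
recovered <- rawToChar(charToRaw(iconv(mojibake, from = "UTF-8", to = "latin1")))
Encoding(recovered) <- "UTF-8"

# Code point of the recovered character, as a hex string
sprintf("U+%04X", utf8ToInt(recovered))
```

If the last line reports U+FEFF, the three characters were a mis-decoded byte order mark rather than real text.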
So the text seems to have been encoded in UTF-8, then decoded as
Latin-1. If you have multiple such artefacts and want to get rid of
them, read the file with its encoding declared as UTF-8:
con <- file("hangouts-conversation-6.csv.txt", encoding = "UTF-8")
a <- readLines(con)
close(con); rm(con)
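
One caveat, offered as a sketch rather than a tested recipe for this particular file: with encoding = "UTF-8" the byte order mark is usually kept as an invisible U+FEFF at the start of the first line. R's file() also accepts "UTF-8-BOM" for reading, which drops a leading BOM if one is present. The temporary file and its contents below are invented for the demonstration.

```r
# Hypothetical demo file: UTF-8 text preceded by a byte order mark (EF BB BF)
path <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)),
           charToRaw("first line\nsecond line\n")),
         path)

# "UTF-8-BOM" asks the connection to skip a leading BOM while reading
con <- file(path, encoding = "UTF-8-BOM")
a <- readLines(con)
close(con); rm(con)

# A stray U+FEFF elsewhere in the text (for example where files were
# concatenated) can be removed afterwards; this is a no-op if none is present
a <- gsub("\uFEFF", "", a, fixed = TRUE)

unlink(path)
```

Here nchar(a[1]) should come out as 10, the length of "first line" with nothing invisible in front of it.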