[R] Please guide -- UTF-8 locale setting fails on Windows on writing

Sunny Singha sunnysingha.analytics at gmail.com
Mon Mar 28 15:46:36 CEST 2016


Hi,
I think I'm experiencing an issue related to the system locale. I have
exported data frames to '.csv' files; the data were gathered from various
social media platforms such as Facebook, Twitter, G+, etc.

I observe that many variables/columns consist of strings formatted like this:
"<U+0645><U+062D><U+0645><U+062F>
<U+0627><U+0644><U+0633><U+0648><U+0627><U+062D>"
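
A minimal sketch of how this might be reproduced, assuming the strings are held as UTF-8 in R and written with write.csv(). The data frame `df` and the column `name` are made up for illustration and are not my actual data:

```r
# Hypothetical one-row data frame holding the first word from the sample above
x  <- "\u0645\u062d\u0645\u062f"
df <- data.frame(name = x, stringsAsFactors = FALSE)

# Strings built from \u escapes are marked as UTF-8 inside R
print(Encoding(df$name))

# Write to a temporary file and look at what actually lands on disk
out <- tempfile(fileext = ".csv")
write.csv(df, out, row.names = FALSE)
written <- readLines(out, warn = FALSE)
print(written)
```

In a locale that cannot represent these characters (such as 1252), I would expect the written line to contain the <U+0645>-style escapes instead of the original text.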

As expected for social media data, and as I have confirmed, these are
strings in different languages.
Platform details are provided at the end of this mail. The OS locale is set
to English (United States), hence the R locale is 'English_United
States.1252'.

I attempted to change it to UTF-8 but received the warning message below:

Warning message:
In Sys.setlocale("LC_ALL", "UTF-8") :
  OS reports request to set locale to "UTF-8" cannot be honored
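
For reference, a minimal sketch of the attempt. The variable names `old` and `res` are only for illustration; the Sys.setlocale() call is the one shown in the warning:

```r
# Current character-type locale; on my machine this is English_United States.1252
old <- Sys.getlocale("LC_CTYPE")
print(old)

# Try to switch; Sys.setlocale() returns "" (with a warning) when the OS refuses
res <- suppressWarnings(Sys.setlocale("LC_ALL", "UTF-8"))
if (identical(res, "")) {
  message("Locale request was not honored; still using: ",
          Sys.getlocale("LC_CTYPE"))
}
```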


I have gone through the threads and posts below but found no resolution so far:
--- http://stackoverflow.com/questions/20571147/how-to-set-unicode-locale-in-r
--- https://stat.ethz.ch/pipermail/r-devel/2013-November/067940.html
--- http://stackoverflow.com/questions/19877676/write-utf-8-files-from-r
--- https://tomizonor.wordpress.com/2013/04/17/file-utf8-windows/
--- http://withr.me/configure-character-encoding-for-r-under-linux-and-windows/

I'm not sure whether the issue arises while reading/extracting the data
from the social media platforms or while writing/exporting to a Windows
directory, but I don't experience a similar issue on my personal Mac. I
need some clarification here.

How can I export the data just as I see it on the web? Please guide.

Regards,
Sunny

Platform I'm using:
Operating System : Windows 7 Professional SP1
R version details:
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          2.3
year           2015
month          12
day            10
svn rev        69752
language       R
version.string R version 3.2.3 (2015-12-10)
nickname       Wooden Christmas-Tree


