[R] Writing Unicode Text into Text File from R (in Windows)

Duncan Murdoch murdoch.duncan at gmail.com
Tue Feb 4 13:48:18 CET 2014


On 14-02-04 5:49 AM, Majid Einian wrote:
> Dear R Helpers,
>
> See the Code:
>
> a <- intToUtf8(1777)
> show(a)
> zz <- file(description="test.txt",open="w",encoding="UTF-8")
> cat(a, file = zz)
> close(zz)
>
> In a Unicode-aware environment (such as the RGui console or the RStudio
> console) you will see this output:
>
> [1] "۱"
>
>
> but the character is not written correctly to the file test.txt (which is
> encoded in UTF-8 without a BOM):
>
> <U+06F1>
>
> The problem seems to be this: R converts the text to the system locale (for
> me this is Arabic Windows, code page 1256, which has no code for U+06F1),
> then converts it back to UTF-8 and writes it to the file.
> What am I missing here?
> How can I write a Unicode string to a text file correctly?

There are a lot of places in R where it converts strings to the local 
encoding, perhaps too many. On the other hand, maybe Windows should be 
offering UTF-8 locales by now.

I haven't tested in your locale, but I believe writeLines() to a 
connection declared to be in a UTF-8 encoding will maintain the 
encoding.  You can declare a file to be in encoding "UTF-8-BOM" if you 
want to ignore a BOM on input; I forget whether it will write one on 
output.  If it doesn't, you can always write one explicitly.
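
As an illustration only, here is a minimal sketch of one way to keep R from converting through the native encoding: open the file in binary mode and pass useBytes = TRUE to writeLines(), optionally writing the BOM bytes by hand. This is not the declared-encoding approach described above, and it has not been tested in a code page 1256 locale. The helper name write_utf8 and its bom argument are hypothetical, not part of R.

```r
# Hypothetical helper (not part of base R): write strings as UTF-8 bytes,
# avoiding any conversion to the native (locale) encoding.
write_utf8 <- function(x, path, bom = FALSE) {
  con <- file(path, open = "wb")  # binary mode: no re-encoding, "\n" stays LF
  on.exit(close(con))
  if (bom) {
    writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)  # UTF-8 byte order mark
  }
  # enc2utf8() makes sure the strings are UTF-8; useBytes = TRUE writes
  # those bytes as they are instead of translating them first.
  writeLines(enc2utf8(x), con, useBytes = TRUE)
  invisible(path)
}

a <- intToUtf8(1777)  # U+06F1
out <- file.path(tempdir(), "test.txt")
write_utf8(a, out)

# Inspect the raw bytes that ended up in the file.
bytes <- readBin(out, what = "raw", n = 16)
print(bytes)
```

When reading the file back, readLines(out, encoding = "UTF-8") marks the result as UTF-8 rather than treating it as text in the native encoding.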

I was hoping to make some progress on this before R 3.1.0 so that more 
cases of writing strings to UTF-8 files would work, but time is running out.

Duncan Murdoch

>
>
> Majid Einian,
> Economics Researcher, Monetary and Banking Research Institute, Central Bank
> of Islamic Republic of Iran, Tehran, IRAN
> and
> PhD Candidate in "Economics", Graduate School of Management and
> Economics, Sharif University of Technology, Tehran, IRAN
>



