[R] list of valid R encodings.in source(...,encoding=)
h.wickham at gmail.com
Fri Jul 11 09:06:18 CEST 2014
It's documented in the Encodings section of ?file:
"As from R 3.0.0 the encoding "UTF-8-BOM" is accepted for reading and
will remove a Byte Order Mark if present (which it often is for files
and webpages generated by Microsoft applications). If it is required
(it is not recommended) when writing it should be written explicitly,
e.g. by writeChar("\ufeff", con, eos = NULL) or
writeBin(as.raw(c(0xef, 0xbb, 0xbf)), binary_con)"
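A rough sketch of what that paragraph describes, in case it helps. The UTF-8 byte order mark is the byte sequence EF BB BF. The temporary file and the variable names (path, first_bytes, with_bom_encoding) are made up for illustration and are not from ?file:

```r
# Hypothetical temporary file, used only for this illustration.
path <- tempfile(fileext = ".txt")

# Write the three-byte UTF-8 BOM explicitly, then one line of ASCII text.
con <- file(path, open = "wb")
writeBin(as.raw(c(0xef, 0xbb, 0xbf)), con)
writeBin(charToRaw("x <- 1\n"), con)
close(con)

# Inspect the first three bytes on disk: these are the BOM.
first_bytes <- readBin(path, what = "raw", n = 3)
print(first_bytes)

# Read through a connection declared as "UTF-8-BOM" so the mark is
# removed before the text reaches readLines().
con <- file(path, open = "r", encoding = "UTF-8-BOM")
with_bom_encoding <- readLines(con)
close(con)
print(with_bom_encoding)
```

Declaring the encoding on the connection keeps the BOM handling in one place, so whatever reads the lines afterwards does not have to strip anything itself.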
On Fri, Jul 11, 2014 at 12:50 AM, John McKown
<john.archie.mckown at gmail.com> wrote:
> This question was spawned by another thread entitled "R on Windows
> crashes when source'ing UTF-8 file".
> The solution to that problem was to use the _proper_ encoding=
> parameter of the source() function. But where are they documented? Or
> how do I find them in R itself? I ask because the proper encoding to
> solve the problem was "UTF-8-BOM". I got this by reading the source
> code to main/connections.c . Not where I expect most people to go.
> I found iconvlist(). But it does not list UTF-8-BOM, only UTF8 and
> UTF-8. I got no useful response to ??BOM from the R prompt.
> My normal locale is "C" on Linux. If I use encoding="UTF-8" in the
> source() line, it fails because the BOM at the start is interpreted as
> data to be processed. If I use UTF-8-BOM instead, it succeeds. It also
> succeeds if I do Sys.setlocale("LC_ALL","en_US.utf8").
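For illustration only, here is one way the choice of encoding= could be automated. This is a sketch under assumptions, not something source() does for you: has_utf8_bom() is a hypothetical helper, and the script file is a throwaway created just for the example.

```r
# Hypothetical helper: TRUE if the file starts with the bytes EF BB BF.
has_utf8_bom <- function(path) {
  identical(readBin(path, what = "raw", n = 3), as.raw(c(0xef, 0xbb, 0xbf)))
}

# Throwaway script that begins with a BOM, as a Windows editor might save it.
script <- tempfile(fileext = ".R")
con <- file(script, open = "wb")
writeBin(as.raw(c(0xef, 0xbb, 0xbf)), con)
writeBin(charToRaw("answer <- 6 * 7\n"), con)
close(con)

# Pick the encoding from the file contents, then source into a fresh
# environment so the global workspace is left untouched.
enc <- if (has_utf8_bom(script)) "UTF-8-BOM" else "UTF-8"
env <- new.env()
source(script, local = env, encoding = enc)
print(env$answer)
```

Checking the bytes first avoids having to know in advance which tool produced the file.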
> I admit that I don't understand all (or even much) of the ins-and-outs
> of i18n, or code pages. But the UTF-8-BOM is just "weird" to me; and
> confusing since it is not documented anywhere I can find.
> John McKown