[R] list of valid R encodings in source(...,encoding=)

John McKown john.archie.mckown at gmail.com
Thu Jul 10 16:50:58 CEST 2014


This question was spawned by another thread entitled "R on Windows
crashes when source'ing UTF-8 file".

The solution to that problem was to use the _proper_ encoding=
parameter of the source() function. But where are the valid values
documented? Or how do I find them in R itself? I ask because the proper
encoding to solve the problem was "UTF-8-BOM". I found this by reading
the source code in main/connections.c, which is not where I expect most
people to look.

I found iconvlist(). But it does not list UTF-8-BOM, only UTF8 and
UTF-8. I got no useful response to ??BOM from the R prompt.
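For anyone who wants to repeat that check, here is a minimal sketch. It assumes only base R; the variable names are made up for illustration, and the output depends on the platform's iconv, so no particular result is claimed.

```r
# Encoding names known to the iconv implementation on this platform.
encs <- iconvlist()

# Pick out the UTF-8-like names; what comes back is platform dependent.
utf8_like <- grep("^UTF-?8", encs, value = TRUE, ignore.case = TRUE)
print(utf8_like)

# "UTF-8-BOM" appears to be handled by R's own connection code rather
# than by iconv, which would explain why it is missing from this list.
has_bom_name <- "UTF-8-BOM" %in% encs
print(has_bom_name)
```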

My normal locale is "C" on Linux. If I use encoding="UTF-8" in the
source() line, it fails because the BOM at the start is interpreted as
data to be processed. If I use UTF-8-BOM instead, it succeeds. It also
succeeds if I do Sys.setlocale("LC_ALL","en_US.utf8").
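The behaviour can be reproduced with a throwaway file. This is only a sketch: the temporary file, the one-line script and the names path, first3 and env are invented for the example, and it assumes an R version whose connections accept "UTF-8-BOM".

```r
# Write a one-line script preceded by the three-byte UTF-8 BOM (EF BB BF).
path <- tempfile(fileext = ".R")
con <- file(path, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("x <- 1 + 1\n"), con)
close(con)

# Confirm that the file really starts with the BOM.
first3 <- readBin(path, what = "raw", n = 3L)
print(first3)

# "UTF-8-BOM" asks the connection to drop a leading BOM if one is present,
# so the parser never sees it. Evaluate into a scratch environment.
env <- new.env()
source(path, local = env, encoding = "UTF-8-BOM")
print(env$x)

unlink(path)
```

Changing the encoding to "UTF-8" in the source() call is the quickest way to see whether a given locale trips over the BOM.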

I admit that I don't understand all (or even much) of the ins and outs
of i18n or code pages. But UTF-8-BOM is just "weird" to me, and
confusing, since it is not documented anywhere I can find.

-- 
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! <><
John McKown


