R, LANG, and LC_....

Peter Dalgaard BSA p.dalgaard@biostat.ku.dk
08 Jan 1998 19:02:37 +0100


Martin Maechler <maechler@stat.math.ethz.ch> writes:

>     setlocale(LC_ALL,"");
> by
>
>     setlocale(LC_CTYPE,"");
>     setlocale(LC_........);
>     setlocale(LC_........);
>
> But probably you know better than me what the  ...... should be.

Hmm, it's not easy to say. LC_COLLATE is a rather obvious candidate, but
maybe not in all languages (the Danish alphabet contains the English one
as its first 26 characters in their usual order, so English text would
sort the same, but does that hold for all languages?).
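
For concreteness, here is a rough sketch of what the per-category
replacement could look like. The helper name init_locale_categories and
the choice of exactly LC_CTYPE plus LC_COLLATE are my assumptions for
illustration; neither is taken from R's sources, and which categories
to include is exactly the open question.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper (not from R's sources): adopt the user's
 * environment for selected categories only, instead of calling
 * setlocale(LC_ALL, "").  Returns the number of categories that
 * could not be set. */
int init_locale_categories(void)
{
    /* Assumed choice of categories; the thread leaves this open. */
    static const int categories[] = { LC_CTYPE, LC_COLLATE };
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof categories / sizeof categories[0]; i++) {
        /* "" means: take the value from the environment
         * (LC_ALL, the category's own variable, or LANG). */
        if (setlocale(categories[i], "") == NULL)
            failures++;
    }
    return failures;
}
```

Categories that are not listed, such as LC_NUMERIC, stay in the default
"C" locale, so number formatting and parsing are unaffected.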

> Note that we still have the problem that R's parser does not properly use
> all the isalpha(.) characters when e.g. checking variable names.

Yes, this is strange (Solaris 2.5.1, LC_CTYPE=iso_8859_1):

> bø<-1
> æøå<-1
Error: syntax error
> øå<-1
> æ<-1
Error in 0 <- 1 : invalid (do_set) left-hand side to assignment

apparently æ (0xe6) is treated differently from its siblings. It seems
to be treated syntactically as the digit 0. Even stranger, if I set
LANG=da and unset LC_CTYPE, I get a syntax error on everything with
an 'ø' in it (??!!!)
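
One way to narrow this down might be to check how the C library itself
classifies the three bytes under the active locale, independently of
R's parser. The following is only a diagnostic sketch: classify_byte
and report_danish_letters are hypothetical names, and the byte values
assume ISO 8859-1 (æ = 0xe6, ø = 0xf8, å = 0xe5).

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical diagnostic, not part of R.  Classify one byte under the
 * current LC_CTYPE locale: 'A' alphabetic, 'D' digit, '-' neither.
 * The argument is an unsigned char on purpose: passing a negative
 * char value to isalpha() or isdigit() is undefined behaviour. */
char classify_byte(unsigned char c)
{
    if (isalpha(c))
        return 'A';
    if (isdigit(c))
        return 'D';
    return '-';
}

/* Print the classification of the ISO 8859-1 codes for æ, ø and å,
 * using whatever LC_CTYPE the environment selects. */
void report_danish_letters(void)
{
    static const unsigned char codes[] = { 0xe6, 0xf8, 0xe5 };
    size_t i;

    setlocale(LC_CTYPE, "");
    for (i = 0; i < sizeof codes / sizeof codes[0]; i++)
        printf("0x%02x: %c\n", (unsigned)codes[i], classify_byte(codes[i]));
}
```

If all three bytes come out as alphabetic here while the parser still
rejects them, the problem is presumably in the parser's own character
handling rather than in the locale tables.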

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._