R-alpha: Latin-1 characters / Locale etc.

Martin Maechler <maechler@stat.math.ethz.ch>
Thu, 27 Nov 1997 10:35:06 +0100

>>>>> "PD" == Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk> writes:

    PD> Ross Ihaka <ihaka@stat.auckland.ac.nz> writes:
    >> ------------------------
    >> R & R, any comments?
    >> ------------------------
    >> At present the parser makes the decision on what characters can go
    >> into symbol names based on isalpha(c).  If someone will send me a
    >> function - say isidchar(c) which returns 1 for characters which can
    >> be in identifiers and 0 otherwise, I will replace the current test
    >> with that.
    >> Ross

Hmm, so we would follow the Unix locale philosophy.
I could live with it.

It has, however, a distinct drawback:

You can write R code which works with R compiled in one environment but
fails with --identical R source code-- compiled in a different environment.

While this is true for things like 'readline' and 'proc.time / system.time',
I don't like it so much for such a basic thing as symbol characters.
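
To make the drawback concrete, here is a minimal C sketch of an
isalpha()-based test. It is not the parser's actual code. isidchar() takes
its name from Ross's suggestion; isidchar_in() is a hypothetical helper, and
the rule "letters, digits and '.'" is an assumption made for illustration.

```c
#include <ctype.h>
#include <locale.h>
#include <string.h>

/* Hypothetical isidchar(): 1 if c may appear in an identifier, else 0.
 * Assumption: identifiers consist of letters, digits and '.'.
 * isalpha() consults the current LC_CTYPE locale, so the answer for
 * bytes above 0x7f depends on the environment. */
int isidchar(int c)
{
    unsigned char uc = (unsigned char) c;
    return (isalpha(uc) || isdigit(uc) || uc == '.') ? 1 : 0;
}

/* Hypothetical helper: evaluate isidchar(c) under the named locale and
 * restore the previous locale afterwards.
 * Returns -1 if the locale cannot be selected on this machine. */
int isidchar_in(const char *locale, int c)
{
    char saved[256];
    const char *old = setlocale(LC_CTYPE, NULL);
    int result;

    if (old == NULL || strlen(old) >= sizeof saved)
        return -1;
    strcpy(saved, old);          /* the returned string may be overwritten */

    if (setlocale(LC_CTYPE, locale) == NULL)
        return -1;               /* locale not installed here */
    result = isidchar(c);
    setlocale(LC_CTYPE, saved);
    return result;
}
```

In the "C" locale, isidchar_in("C", 0xE4) is 0, since only the 26 ASCII
letters in each case count as alphabetic there. With a Latin-1 locale (its
name, e.g. "da_DK.ISO-8859-1", varies by system and is an assumption here)
the same byte may be accepted, and the call returns -1 where that locale is
missing.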

    PD> Ahaaa... So the "oscillatory behaviour" is just me shifting between
    PD> machines with proper locale configuration and machines without it!
    PD> I think that isalpha() is actually the way to go. People just have
    PD> to get their locales right. Here's what's in isalpha(c)==1 for
    PD> da_DK locale:

    PD> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz­
    PD> ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕ

    PD> The hyphen following 'z' is actually 0xad (soft hyphen).

In any case, I'd propose a new
	function 'alphachars()'
and/or a global variable 
(or something better)
which returns a vector of nchar(1)-characters
giving the available symbols.

In	../library/base/Alpha.Rd  (the accompanying help page),
all this would then be explained to users.
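
A rough idea of what could sit behind such a function at the C level,
assuming single-byte characters and that isalpha() stays the criterion.
Only the name alphachars comes from the proposal above; the signature and
the buffer convention are made up for this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical C-level counterpart of the proposed alphachars():
 * stores every single-byte character c (1..255) for which isalpha(c)
 * is true in the current locale into buf, NUL-terminates it, and
 * returns the number of characters stored.
 * buf must have room for at least 256 bytes. */
size_t alphachars(char *buf)
{
    size_t n = 0;
    int c;

    for (c = 1; c < 256; c++)
        if (isalpha(c))
            buf[n++] = (char) c;
    buf[n] = '\0';
    return n;
}
```

In the default "C" locale this yields just the 52 ASCII letters; after a
setlocale(LC_CTYPE, ...) call for a Latin-1 locale it should give a list
like the one Peter shows above.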

BTW, Peter D., do you have an (electronic form of a) document available
which nicely explains the locale stuff (for a user, not a C programmer)?
Kurt/Fritz/???: I think there are some nice pages available in Linux.somet

I'm still wondering:
The  only locale thing we have is (the environment variable)


But then I wonder why I saw the difference between ä and ü
that I reported ....

- Martin