[R] Danish characters in R2.0.1 vs R1.9.1 under winXP

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Nov 25 10:48:30 CET 2004


On Thu, 25 Nov 2004, Jean Coursol wrote:

> The same is true in french under linux.

No, it is not the same.  Windows XP does this even in the Danish locale, 
and that is a problem (in Windows).

Your problem is simply that you should be using a French locale to use 
French characters.  You should not expect French characters to be 
recognised in a non-French locale, and if they were, that was a bug in R 
1.9.1.

From ?Sys.setlocale

      The locale describes aspects of the internationalization of a
      program. Initially most aspects of the locale of R are set to
      '"C"' (which is the default for the C language and reflects
      North-American usage). R does set '"LC_CTYPE"' and '"LC_COLLATE"',
      which allow the use of a different character set (typically ISO
      Latin 1) and alphabetic comparisons in that character set
      (including the use of 'sort') ....
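
For illustration only, the two categories mentioned there can be inspected 
from within R.  This is a sketch: the values returned depend on the OS and 
on how R was started, and the vector 'words' is made-up data.

```r
# Character-type and collation categories, which R sets at startup.
ctype   <- Sys.getlocale("LC_CTYPE")
collate <- Sys.getlocale("LC_COLLATE")
print(ctype)
print(collate)

# sort() on character vectors follows the collation order of LC_COLLATE,
# so the result may differ from one locale to another.
words <- c("b", "A", "a", "B")   # illustrative data only
print(sort(words))
```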

> Something changed
> from 1.9.1 to 2.0.0.

Yes, that has already been explained in this thread, so please read the 
earlier replies.

To summarize: a bug has been corrected, so R now works as it has always 
been documented to do, print()ing only the printable characters of the 
current locale.  Unfortunately, for German and Scandinavian locales (at 
least), Windows XP does not correctly report some of the characters used 
in those locales as printable.  As of 2.0.1 patched, we no longer believe 
Windows, but we do still believe other OSes.
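
A minimal sketch of the distinction, assuming a current version of R and a 
session in which the byte below may or may not be printable in the current 
locale: print() escapes what the locale does not report as printable, 
whereas cat() passes the bytes through unchanged.

```r
x <- "\351"                      # the single byte 0xE9 (e-acute in ISO Latin 1)
print(x)                         # shown escaped unless printable in this locale
cat(x, "\n")                     # cat() does not escape; the terminal decides
n_bytes <- nchar(x, type = "bytes")
print(n_bytes)                   # number of bytes in the string
```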


> First, it is necessary to have .inputrc (in $HOME)
> (or $INPUTRC defined) to enter and display 8-bits
> characters under bash and R.

To enter them at the console, yes, but not to read them from, e.g., a file.

> #.inputrc (for readline library)
> set input-meta on
> set output-meta on
> set convert-meta off
>
> Then
>
> Under R2.0.1, I have:

In what locale?  It matters!

>> élément <- "é"           # error for object name
> Error: syntax error
>> element <- "é"
>> element
> [1] "\351"                 # different from R1.9.1 (="é")
>
>> Sys.setlocale('LC_ALL','fr_FR')

The setting you need is LC_CTYPE: see the help page.

> [1] "fr_FR"
>> élément <- "é"           # OK for object name
>> élément
> [1] "é"                    # OK for display
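
As a sketch of setting only the character-type category rather than LC_ALL: 
the locale name 'fr_FR' is the one used above, and is an assumption here, 
since locale names are OS-specific and that one may not exist on your system.

```r
# Remember the current setting so that it can be restored afterwards.
old <- Sys.getlocale("LC_CTYPE")

# Sys.setlocale() returns "" (with a warning) if the request cannot be
# honoured, so test the return value rather than assuming success.
new <- suppressWarnings(Sys.setlocale("LC_CTYPE", "fr_FR"))
if (identical(new, "")) {
  message("Locale 'fr_FR' is not available on this system")
} else {
  message("LC_CTYPE is now ", new)
}

# Restore the original character-type locale.
invisible(Sys.setlocale("LC_CTYPE", old))
```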

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

