[Rd] packages using UTF-8 encoding

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jul 9 15:23:17 CEST 2007

On Mon, 9 Jul 2007, Sebastian P. Luque wrote:

> Hi,
> During a recent CRAN upload procedure, I was reminded of the following
> regarding R-devel:
>    o   R CMD check now warns on non-ASCII .Rd files without an
>        \encoding field, rather than just on ones that are definitely
>        not from an ISO-8859 encoding.  This agrees with the
>        long-standing stipulation in 'Writing R Extensions', and
>        catches some packages with UTF-8 man pages.
>    o   R CMD check now warns on DESCRIPTION files with a non-portable
>        Encoding field, or with non-ASCII data and no Encoding field.
> So if we need UTF-8 encoding for the DESCRIPTION and *.Rd files, would it
> be sufficient to have an "Encoding: UTF-8" line in the former and a
> "\encoding{UTF-*}" in the latter?  Thanks.

\encoding{UTF-8}, yes.
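
As a minimal sketch of the two declarations (the package name 'mypkg' and
the topic name 'myfun' are made up for illustration), the DESCRIPTION file
would carry a line such as:

```
Package: mypkg
Encoding: UTF-8
```

and each non-ASCII .Rd file would declare its encoding near the top:

```
\encoding{UTF-8}
\name{myfun}
```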

However, I would be worried about anything which 'needs' UTF-8 encoding, 
since it is going to be far from portable.  If all you need are characters 
from ISO-8859-1, it is more portable to use that.
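
One rough way to see whether ISO-8859-1 would be enough is to try the
conversion and look for failures.  The sketch below assumes the text is
already held in R as UTF-8 strings; 'fits_latin1' is a hypothetical helper
name, not part of R, and the exact behaviour of iconv() for unconvertible
input can depend on the platform's iconv implementation.

```r
# Hypothetical helper: for each element of x, report whether it can be
# converted from UTF-8 to ISO-8859-1 (latin1).
fits_latin1 <- function(x) {
  # With the default sub = NA, iconv() returns NA for any element that
  # contains a character it cannot convert to the target encoding.
  !is.na(iconv(x, from = "UTF-8", to = "latin1"))
}

# Sample strings written with \u escapes so this file itself stays ASCII.
fits_latin1("Sebasti\u00e1n")        # a-acute is part of ISO-8859-1
fits_latin1("\u0141\u00f3d\u017a")   # L-stroke and z-acute are not
```

If every string in the package passes such a check, declaring ISO-8859-1
(latin1) rather than UTF-8 would be an option.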

My current understanding is that the internationalization/charset support
works well on Windows, MacOS X, systems using glibc (e.g. Linux), FreeBSD
and some commercial Unixen.  What does not work well is font support for
displaying non-ASCII characters: missing glyphs are often silently omitted.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
