[R] non-ascii characters in R output

Matt Shotwell matt at biostatmatt.com
Sat Feb 19 02:01:57 CET 2011



On Fri, 2011-02-18 at 19:50 -0500, Duncan Murdoch wrote:
> On 18/02/2011 5:58 PM, Matt Shotwell wrote:
> > OK, looks like my web browser does render non-ascii characters output by
> > R when it's given the encoding explicitly. This works for me:
> > <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>.
> > So that's another solution, but not a general one.
> 
> I don't understand your final comment.  What is not general about 
> declaring how the file is encoded?

I meant that declaring UTF-8 is not generally applicable, because R
doesn't always output UTF-8 (right?). For example, in a locale that uses
another encoding, such as Latin-1, R might output bytes that are
misinterpreted where UTF-8 is assumed.

The general solution, I suppose, is to automatically generate the
<meta /> line with the encoding used by R.
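
A minimal sketch of that idea, assuming a POSIX-style locale name such
as "en_US.UTF-8". The helper names charset_from_locale and meta_line are
made up for illustration and are not existing R functions. A Windows
locale such as "English_United States.1252" would need an extra mapping
to a charset name that browsers recognize, which is not handled here:

```r
# Hypothetical helper: extract the codeset from a POSIX-style locale
# string, e.g. "en_US.UTF-8" -> "UTF-8".
charset_from_locale <- function(ctype = Sys.getlocale("LC_CTYPE")) {
  # The C/POSIX locale only guarantees ASCII.
  if (ctype %in% c("C", "POSIX")) return("US-ASCII")
  # No codeset part in the name: give up rather than guess.
  if (!grepl(".", ctype, fixed = TRUE)) return(NA_character_)
  # Keep everything after the first dot ...
  codeset <- sub("^[^.]*\\.", "", ctype)
  # ... and drop a trailing modifier such as "@euro".
  sub("@.*$", "", codeset)
}

# Hypothetical helper: build the <meta /> line for a given charset.
meta_line <- function(charset = charset_from_locale()) {
  if (is.na(charset)) stop("could not determine the charset from the locale")
  sprintf(
    '<meta http-equiv="Content-Type" content="text/html; charset=%s"/>',
    charset
  )
}
```

Calling cat(meta_line(), "\n") while writing the page header would then
emit the line for whatever locale R is running in. In an en_US.UTF-8
session this should give the same <meta /> line quoted above.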

Matt

> 
> Duncan Murdoch
> 
> >
> > -Matt
> >
> > On Fri, 2011-02-18 at 12:47 -0600, Matt Shotwell wrote:
> >> All,
> >>
> >> I'd like to automatically output text from R to HTML. In doing this I've
> >> run into trouble with non-ascii characters, as my browser (and
> >> presumably others) does not render such characters correctly. For
> >> example, the 'fancy' single quotes associated with summary.lm are
> >> multi-byte characters on my platform. This particular problem is solved
> >> by options(useFancyQuotes=FALSE). But now I'm concerned about other
> >> non-ascii characters. My current solution, which may be overkill,
> >> involves capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there
> >> other sources of non-ascii characters? Is there a better or more
> >> general solution?
> >>
> >> Best,
> >> Matt
> >>
> >>   >  sessionInfo()
> >> R version 2.12.1 (2010-12-16)
> >> Platform: x86_64-pc-linux-gnu (64-bit)
> >>
> >> locale:
> >>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
> >>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>
> >> loaded via a namespace (and not attached):
> >> [1] tools_2.12.1
> >>