[Rd] Differences in printing UTF-8 strings to stdout vs. stderr under Windows

Tue Sep 22 15:32:09 CEST 2015

On 22/09/2015 8:55 AM, Richard Cotton wrote:
> It seems that under Windows, some UTF-8 strings that print OK to
> stdout do not print correctly to stderr.  To reproduce:
> 
> x <- "\ub124"
> cat(x, file = stdout())
> ## 네
> cat(x, file = stderr())
> ## <U+B124>
> 
> Original motivating problem here:
> https://stackoverflow.com/questions/32696241/how-to-display-a-message-warning-error-with-unicode-characters-under-windows
> 
> How does printing to stderr differ from printing to stdout?
> 
> And more importantly, is there any way I can ensure correct printing
> of Unicode characters when I need to write to stderr (when throwing
> errors or warnings)?
> 

The answer to the last question is certainly "no".  If you are running
Rterm, you are limited to printing in the native encoding (the code page
of the current locale), and that won't cover all Unicode characters.
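
For what it's worth, here is a minimal sketch of how one might check in
advance whether a string survives conversion to the native encoding.
The helper name is_representable is made up for illustration and is not
part of R.  It relies on iconv() returning NA for input it cannot
convert when sub is left at its default of NA, and on to = "" meaning
the encoding of the current locale.

```r
# Hypothetical helper (not part of R): reports, element by element,
# whether a character vector can be converted to a target encoding.
# to = "" asks iconv() for the current locale's (native) encoding.
is_representable <- function(x, to = "") {
  x <- enc2utf8(x)                          # normalise the input to UTF-8
  converted <- iconv(x, from = "UTF-8", to = to)
  !is.na(converted)                         # NA marks a failed conversion
}

x <- "\ub124"
is_representable(x)                 # result depends on the session's locale
is_representable(x, to = "ASCII")   # checks against plain ASCII instead
```

This only tells you whether the conversion would lose information; it
does not make the character printable on stderr().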

In Rgui we make more of an effort: characters are converted to UTF-16,
which covers most Unicode characters.  (I think we still don't handle
surrogate pairs, but they're rarely used.)  Judging by your example,
that conversion is done for stdout() but not for stderr().  I don't know
the rationale for that design.

Duncan Murdoch