[Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

Tomas Kalibera tomas.kalibera at gmail.com
Tue Jul 17 15:12:18 CEST 2018


Hi Kevin,

The extra bytes you are seeing are escapes that mark UTF-8 strings in 
input to the RGui console. Recently, ASCII strings started being 
converted to UTF-8 as well, so you now get these escapes for ASCII 
strings too. RGui understands these escapes and converts from UTF-8 to 
wide characters before printing on Windows. The escapes should not be 
used unless printing to the RGui console.
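
To make the escapes concrete, here is a minimal sketch in R that finds and removes them at the byte level. It assumes the markers are exactly the three-byte sequences visible in the quoted charToRaw() output below (02 ff fe before the string, 03 ff fe after it). The helper name strip_rgui_escapes is hypothetical and not part of R; it is meant only for inspecting leaked output, not as a fix.

```r
# Hypothetical helper (not part of R): drop the RGui UTF-8 escape markers
# from a string by scanning its raw bytes.
# Assumption: the markers are 02 ff fe (start) and 03 ff fe (end), as seen
# in the byte dump from the report.
strip_rgui_escapes <- function(s) {
  bytes <- charToRaw(s)
  n <- length(bytes)
  keep <- rep(TRUE, n)
  i <- 1L
  while (i + 2L <= n) {
    is_marker <- (bytes[i] == as.raw(0x02) || bytes[i] == as.raw(0x03)) &&
      bytes[i + 1L] == as.raw(0xff) &&
      bytes[i + 2L] == as.raw(0xfe)
    if (is_marker) {
      keep[i:(i + 2L)] <- FALSE  # discard all three marker bytes
      i <- i + 3L
    } else {
      i <- i + 1L
    }
  }
  rawToChar(bytes[keep])
}

# Rebuild the reported byte sequence: 02 ff fe 61 70 70 6c 65 03 ff fe
leaked <- rawToChar(as.raw(c(0x02, 0xff, 0xfe,
                             0x61, 0x70, 0x70, 0x6c, 0x65,
                             0x03, 0xff, 0xfe)))
clean <- strip_rgui_escapes(leaked)
print(charToRaw(clean))
```

Working on raw bytes avoids any re-encoding of the string, which matters here because ff and fe are not valid UTF-8 on their own.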

I suppose you managed to leak the escapes, but I cannot reproduce this. 
The example you sent seems incomplete ("x" is not used, it is not clear 
what encoding.R is, and it is not clear where encodeString is run), and 
none of the variations I ran leaked the escapes on R-devel. Please 
clarify the example if you believe it is a bug. Please also use current 
R-devel: I relatively recently fixed a bug in decoding these escaped 
strings, and while a connection is perhaps unlikely, it is not 
impossible.

Best
Tomas

On 07/16/2018 10:01 PM, Kevin Ushey wrote:
> Given the following R script:
>
>     x <- 1
>     print(list())
>     save(x, file = tempfile())
>     output <- encodeString("apple")
>     print(output)
>
> If I source this script from RGui on Windows, I see the output:
>
>     > source("encoding.R")
>     list()
>     [1] "\002ÿþapple\003ÿþ"
>
> That is, it's as though R has injected what looks like byte order
> marks around the encoded string:
>
>     > charToRaw(output)
>      [1] 02 ff fe 61 70 70 6c 65 03 ff fe
>
> FWIW I see the same output in R-patched and R-devel. Any idea what
> might be going on? For what it's worth, I don't see the same issue
> with R as run from the terminal.
>
> Thanks,
> Kevin
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


