[Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

Kevin Ushey kevinushey using gmail.com
Wed Jul 18 16:48:44 CEST 2018


Thank you for the quick fix! I could've sworn the 'save()' dance was a
necessary part of the reproducible example, but evidently not ...
On Wed, Jul 18, 2018 at 6:38 AM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>
> Fixed in R-devel and R-patched,
> Tomas
>
> On 07/18/2018 12:03 PM, Tomas Kalibera wrote:
>
> Thanks, I can now reproduce it. It is a bug that is easy to fix, and I will do so shortly.
>
> FYI, it can be reproduced simply by running these two lines in RGui:
>
> list()
> encodeString("apple")
>
> Best
> Tomas
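
A minimal sketch for checking whether a given session leaks the escapes, assuming the markers are exactly the byte sequences 02 ff fe and 03 ff fe shown in the charToRaw() output further down this thread. The helper name has_leaked_escapes is hypothetical and not part of R.

```r
# Marker bytes as they appear in the charToRaw() output quoted in this thread.
start_marker <- as.raw(c(0x02, 0xff, 0xfe))
end_marker   <- as.raw(c(0x03, 0xff, 0xfe))

# Hypothetical helper: TRUE if the single string `x` starts with the
# start marker and ends with the end marker.
has_leaked_escapes <- function(x) {
  bytes <- charToRaw(x)
  n <- length(bytes)
  n >= 6 &&
    identical(bytes[1:3], start_marker) &&
    identical(bytes[(n - 2):n], end_marker)
}

# The two-line reproduction from the message above, followed by the check.
print(list())
out <- encodeString("apple")
has_leaked_escapes(out)
```

Outside RGui, or in a build that includes the fix, the last call would be expected to return FALSE.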
>
> On 07/17/2018 05:16 PM, Kevin Ushey wrote:
>
> Sorry, I should have been more clear -- if I write the contents of
> that script to a file called 'encoding.R' and source that, then I see
> the reported behavior.
>
> Here's something standalone that you should hopefully be able to copy
> + paste into RGui to reproduce:
>
> code <- '
>    x <- 1
>    print(list())
>    save(x, file = tempfile())
>    output <- encodeString("apple")
>    print(output)
> '
>
> file <- tempfile(fileext = ".R")
> writeLines(code, con = file)
> source(file)
>
> When I run this, I see:
>
> code <- '
> +    x <- 1
> +    print(list())
> +    save(x, file = tempfile())
> +    output <- encodeString("apple")
> +    print(output)
> + '
>
> file <- tempfile(fileext = ".R")
> writeLines(code, con = file)
> source(file)
>
> list()
> [1] "\002ÿþapple\003ÿþ"
>
> This is with today's R-devel:
>
> sessionInfo()
>
> R Under development (unstable) (2018-07-16 r74967)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 17134)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.6.0
>
> I realize the example looks incomplete, but it seems like each step is
> required to reproduce the strange behavior:
>
>    1) You need to print an empty list,
>    2) You need to invoke save() after printing that empty list,
>    3) Then, attempts to call encodeString() will produce the strange output.
>
> For what it's worth, it may be related to a behavior I'm seeing where
> the first name printed for an R list is quoted with backticks even
> when not necessary:
>
> list(x = 1, y = 2)
>
> $`x`
> [1] 1
>
> $y
> [1] 2
>
> Thanks,
> Kevin
>
> On Tue, Jul 17, 2018 at 6:12 AM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>
> Hi Kevin,
>
> the extra bytes you are seeing are escapes for UTF-8 strings used in
> input to the RGui console. ASCII strings are now converted to UTF-8 as
> well (a recent change), so you get these escapes for ASCII strings too.
> RGui understands these escapes and converts from UTF-8 to wide
> characters before printing on Windows. The escapes should not be used
> unless printing to the RGui console.
>
> I suppose you managed to leak the escapes, but I cannot reproduce this.
> The example you sent seems incomplete ("x" is not used, it is not clear
> what encoding.R is, and it is not clear where encodeString is run), and
> none of the variations I ran leaked the escapes on R-devel. Please
> clarify the example if you believe it is a bug. Please also use current
> R-devel: I relatively recently fixed a bug in decoding these escaped
> strings, and it is unlikely but not impossible that it is related.
>
> Best
> Tomas
>
> On 07/16/2018 10:01 PM, Kevin Ushey wrote:
>
> Given the following R script:
>
>     x <- 1
>     print(list())
>     save(x, file = tempfile())
>     output <- encodeString("apple")
>     print(output)
>
> If I source this script from RGui on Windows, I see the output:
>
>     > source("encoding.R")
>     list()
>     [1] "\002ÿþapple\003ÿþ"
>
> That is, it's as though R has injected what looks like byte order
> marks around the encoded string:
>
>     > charToRaw(output)
>      [1] 02 ff fe 61 70 70 6c 65 03 ff fe
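
To make the byte dump above easier to read, here is a rough sketch that splits it into a leading marker, the payload, and a trailing marker. It assumes only the bytes reported above; strip_markers is a hypothetical helper, not an R function.

```r
# Bytes exactly as reported by charToRaw(output) above.
leaked <- as.raw(c(0x02, 0xff, 0xfe,
                   0x61, 0x70, 0x70, 0x6c, 0x65,
                   0x03, 0xff, 0xfe))

start_marker <- as.raw(c(0x02, 0xff, 0xfe))
end_marker   <- as.raw(c(0x03, 0xff, 0xfe))

# Hypothetical helper: drop the markers if both are present,
# otherwise return the bytes unchanged.
strip_markers <- function(bytes) {
  n <- length(bytes)
  wrapped <- n >= 6 &&
    identical(bytes[1:3], start_marker) &&
    identical(bytes[(n - 2):n], end_marker)
  if (wrapped) bytes[seq_len(n - 6) + 3] else bytes
}

payload <- strip_markers(leaked)
rawToChar(payload)  # "apple"
```

The payload itself is plain ASCII; only the three bytes on either side differ from what encodeString("apple") returns in a terminal session.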
>
> FWIW I see the same output in R-patched and R-devel. Any idea what
> might be going on? For what it's worth, I don't see the same issue
> with R as run from the terminal.
>
> Thanks,
> Kevin
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>


