[R] Problem with Windows clipboard and UTF-8

Rui Barradas ru|pb@rr@d@@ @end|ng |rom @@po@pt
Fri Sep 30 16:26:16 CEST 2022


Hello,

I can reproduce this.


C:\Users\ruipb>R -q -e  "writeClipboard('categoría'); sessionInfo()"
 > writeClipboard('categoría'); sessionInfo()
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Portugal.utf8  LC_CTYPE=Portuguese_Portugal.utf8
[3] LC_MONETARY=Portuguese_Portugal.utf8 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Portugal.utf8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.2.1

# quoting Andrew: Pasting the result into this e-mail message yields
categoría
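
One plausible reading of this kind of garbling (an assumption on my part, not something confirmed in this thread) is that the UTF-8 bytes of the string are handed to the clipboard as-is and then decoded as a single-byte code page. The sketch below only simulates that misreading with iconv(); it does not touch the clipboard, and "latin1" stands in for the Windows ANSI code page.

```r
# Simulation only: reinterpret the UTF-8 bytes of a string as latin1.
# This is an illustration of the suspected mechanism, not a trace of
# what writeClipboard() actually does internally.
x <- "categor\u00eda"                  # "categoría", written with an escape

utf8_bytes <- charToRaw(enc2utf8(x))   # 10 bytes; the accented i takes 2 (c3 ad)
print(utf8_bytes)

# Treat those same bytes as latin1 and convert back to UTF-8
misread <- iconv(rawToChar(utf8_bytes), from = "latin1", to = "UTF-8")

print(nchar(x))        # 9 characters in the original
print(nchar(misread))  # 10 characters once each byte is read as one character
```

If that is what happens, every accented character would come out as two characters in the pasted text.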



And, with the same sessionInfo() output as above:


R -q -e  "writeClipboard('categoría', format = 13)"
# <Ctrl+V> paste clipboard here
categoría
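
Until the default changes, a small wrapper avoids having to pass format = 13 on every call. This is only a sketch: write_clipboard_utf8 is a hypothetical name, not a function in base R, and it simply forwards to utils::writeClipboard(), which exists only on Windows.

```r
# Hypothetical helper (not part of base R): same as writeClipboard(),
# but with format = 13 (Unicode text) as the default.
write_clipboard_utf8 <- function(str, format = 13L) {
  if (.Platform$OS.type != "windows") {
    stop("writeClipboard() is only available on Windows")
  }
  utils::writeClipboard(str, format = format)
}

# Usage on Windows (commented out so the sketch can be sourced anywhere):
# write_clipboard_utf8("categor\u00eda")
```

The format argument is kept so that format = 1 can still be requested explicitly.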


Hope this helps,

Rui Barradas

At 14:05 on 30/09/2022, Andrew Hart via R-help wrote:
> Hi everyone,
> 
> Recently I upgraded to R 4.2.1, which now uses UTF-8 internally as its
> native encoding. Very nice. However, I've discovered that if I use
> writeClipboard to move a string containing accented characters
> to the Windows clipboard and then try to paste that into another
> application (e.g. Notepad, Eclipse, etc.), the accents turn out all
> garbled. Here's an example:
> 
> writeClipboard("categoría")
> Pasting the result into this e-mail message yields
> Categoría
> 
> As near as I can tell, the problem seems to have something to do with 
> the format parameter of writeClipboard. By default, format has a value 
> of 1, which tells the clipboard to receive text in the machine's locale.
> If I set format=13 in the call, the accents transfer to the clipboard 
> correctly:
> 
> writeClipboard("categoría", format=13)
> and the result is
> Categoría
> 
> It seems that format=13 may be a better default now that R is using 
> UTF-8. It would be nice not to have to specify the format every time I 
> want to copy text to the clipboard with writeClipboard.
> 
> Is writeClipboard supposed to perform any kind of encoding conversion or 
> is the format parameter merely informing the clipboard of the kind of 
> payload it's being handed?
> 
> Btw, with pre-4.2.0 versions of R, this wasn't a problem. I am very much 
> in favour of R using some kind of Unicode encoding natively, but this 
> wrinkle seems to be something the user shouldn't have to deal with since 
> the Windows clipboard is capable of holding Unicode text. Any advice 
> would be gratefully received.
> 
> Thanks,
>      Andrew.
> 


