[R] UTF-8 to the console
tom@@@k@||ber@ @end|ng |rom gm@||@com
Thu Oct 13 17:41:41 CEST 2022
Thanks for the report. This is actually a bug in Rterm (or in Windows;
it is hard to tell, but it is something that can be fixed in Rterm).
More below.
On 6/23/22 12:26, Helmut Schütz wrote:
> Dear all,
> I want to send UTF-8 characters to the console. Font in the
> GUI-Preference 'Lucida Console', supporting the desired symbols:
> greater than or equal: UTF-8 2265, HTML-entity ≥ HTML-Unicode
> ≥ TeX \ge
> approximately equal: UTF-8 2248, HTML-entity ≈ HTML-Unicode
> ≈ TeX \approx
> txt <- "x ≥ y, x \u2265 y; a ≈ b, a \u2248 b"
> Encoding(txt) <- "UTF-8"
>  "x = y, x = y; a \230 b, a \230 b"
> cat(txt, "\n")
> x = y, x = y; a ˜ b, a ˜ b
> Desired "x ≥ y, x ≥ y; a ≈ b, a ≈ b"
> I'm sending the email in UTF-8. Don’t know how @r-project.org is
> configured (ASCII?) If you see garbage, I'm sorry but you should get
> the idea.
> R 4.2.0 on Windows 7 (UCRT10.0.10240.16390) and Windows 11.
The underlying problem, which I can reproduce on my Windows 10 (and
which is almost surely what you are seeing on Windows 11), is that the
characters ≥ and ≈ cannot be pasted into Rterm when it runs in cmd.exe
or PowerShell. Pasting these characters pastes nothing.
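
To separate a pasting problem from an encoding problem, one can build
the string from \u escapes only and inspect its code points and bytes,
which do not depend on what the console displays. A minimal sketch in
base R (the variable names are only for illustration):

```r
# Build the string from \u escapes only, so nothing depends on pasting
# non-ASCII characters into the terminal.
txt <- "x \u2265 y; a \u2248 b"

# Code points of the two symbols, as integers.
ge     <- utf8ToInt("\u2265")
approx <- utf8ToInt("\u2248")

# UTF-8 bytes of the whole string, independent of the console font
# or of how the console renders the characters.
bytes <- charToRaw(enc2utf8(txt))

print(sprintf("U+%04X", c(ge, approx)))
print(bytes)
```

If the code points and bytes are as expected but the console still
shows something else, the problem is on the input or display side
rather than in the string itself.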
I've fixed this now in R-devel 83094 (and R-patched 83095). I would be
grateful if you (or anyone else) could test it, e.g. in R-patched. Most
likely this example will work as it did for me, but please also try
other examples you can think of. Processing the input keys in
Rterm/getline is very tricky and brittle. What the code sees depends on
what the console implementation decides to do, and this differs between
console implementations. Sadly, it is not documented as far as I could
find.
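
To check whether a given build is recent enough to include the fix, one
can compare its svn revision against the revisions mentioned above. A
rough sketch: the helper has_rterm_paste_fix is hypothetical, and it
assumes that the build reports a numeric "svn rev" and that R-devel
builds can be recognized by their status string.

```r
# Hypothetical helper: TRUE if a build at svn revision `svn_rev` should
# contain the Rterm fix (R-devel 83094, R-patched 83095, as quoted above).
# Assumption: every non-devel build is compared against the R-patched
# revision.
has_rterm_paste_fix <- function(svn_rev, devel) {
  if (is.na(svn_rev)) return(NA)      # revision unknown, cannot tell
  svn_rev >= if (devel) 83094L else 83095L
}

# Revision and branch of the running R. "svn rev" may be non-numeric
# for builds not made from a Subversion checkout, hence the NA handling.
svn_rev <- suppressWarnings(as.integer(R.version[["svn rev"]]))
devel   <- grepl("development", R.version$status, fixed = TRUE)

print(has_rterm_paste_fix(svn_rev, devel))
```

This only tells whether the build is new enough. Whether pasting
actually works still has to be tested in the console in question.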
Now, the problem you reported does not happen in the Msys2/mintty
terminal (so the one in Rtools42), because that terminal uses a
different console implementation. Also, the problem doesn't happen with
the Windows Terminal application, which has yet another implementation.
If you ever need a workaround for such problems, I would recommend
trying the Windows Terminal application.
The problem doesn't happen in Rgui, either, but Rgui uses a completely
different code path on the R end; indeed, it does not run Rterm.
There is a key combination "Alt+I" you can press in Rterm, which
switches to debug mode and displays the keyboard codes R receives (it
matches the sources in getline.c). When one sees different behavior,
such as your reported problem, with different console implementations,
it usually comes with different keyboard codes sent to R.
Your report has been very useful, thanks, and sorry for the long delay.
I would have spotted it earlier on R bugzilla (or on the R-devel list).