[Rd] Encoding issues

Iñaki Ucar
Mon Feb 18 16:36:24 CET 2019


Hi,

We found a behaviour that looks strange to us and might be a bug. First,
a little context. The 'units' package lets us set the unit using either
standard evaluation (SE) or non-standard evaluation (NSE). E.g., these
two calls work in the same way:

units::set_units(1:10, "μm")
#> Units: [μm]
#> [1]  1  2  3  4  5  6  7  8  9 10

units::set_units(1:10, μm)
#> Units: [μm]
#> [1]  1  2  3  4  5  6  7  8  9 10

That's micrometers, and it works fine if the session charset is UTF-8.
Now the funny part comes with Windows: the first version, with quotes,
works fine, but the second one fails. This is easy to reproduce on
Linux by setting a non-UTF-8 locale:

LC_CTYPE=en_US.iso88591 Rscript -e 'units::set_units(1:10, "μm")'
#> Units: [μm]
#> [1]  1  2  3  4  5  6  7  8  9 10

LC_CTYPE=en_US.iso88591 Rscript -e 'units::set_units(1:10, μm)'
#> Error: unexpected input in "units::set_units(1:10, μ"
#> Execution halted

However, if you use the first version, with quotes, in an example, and
the package is checked on Windows, it fails too (see
https://ci.appveyor.com/project/edzer/units/builds/22440023#L747). The
package declares UTF-8 encoding, so none of these errors should, in
principle, happen. Am I wrong?
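
For anyone who wants to poke at this, here is a minimal base-R sketch for inspecting what the session locale supports before choosing between the two forms. It is only an illustration under stated assumptions: the unit symbol is taken to be the Greek letter mu (U+03BC), `locale_is_utf8` and `micro_m` are names made up for this example, and nothing here calls or changes the 'units' package.

```r
# Illustrative helper (not part of 'units'): is the session locale UTF-8?
locale_is_utf8 <- function() isTRUE(l10n_info()[["UTF-8"]])

# Build the unit string from an ASCII-only escape, so that the source text
# itself contains no non-ASCII bytes.
# Assumption: the symbol is U+03BC (Greek small letter mu).
micro_m <- "\u03bcm"

# Unicode code points of the string; these do not depend on the locale.
print(utf8ToInt(micro_m))

# Show the locale details that R detected for this session.
str(l10n_info())

if (locale_is_utf8()) {
  # In a UTF-8 session the parser can read the bare, unquoted symbol.
  print(parse(text = micro_m))
} else {
  # Elsewhere the unquoted form may fail to parse, as in the error above.
  message("Non-UTF-8 locale detected: the unquoted symbol may not parse.")
}
```

Running this under both locales used above (for instance by prefixing Rscript with LC_CTYPE=en_US.iso88591) should show which branch is taken. It does not explain the failure of the quoted form during the Windows check.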

Thanks in advance, regards,
Iñaki


