[Rd] R string comparisons may vary with platform (plain text)

Duncan Murdoch murdoch.duncan at gmail.com
Sat Nov 22 21:42:07 CET 2014


On 22/11/2014, 2:59 PM, Stuart Ambler wrote:
> A colleague's R program behaved differently when I ran it, and we thought
> we traced it probably to different results from string comparisons as
> below, with different R versions.  However the platforms also differed.  A
> friend ran it on a few machines and found that the comparison behavior
> didn't correlate with R version, but rather with platform.
> 
> I wonder if you've seen this.  If it's not some setting I'm unaware of,
> maybe someone should look into it.  Sorry I haven't taken the time to read
> the source code myself.

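Looks like a collation order issue: string comparisons use the collating
sequence of the locale in use, which can differ between platforms even
when the locale names look the same.  See ?Comparison.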
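
A minimal sketch of how one might check this, assuming the difference
comes from the LC_COLLATE setting: inspect the collation locale, then
repeat the comparison with LC_COLLATE temporarily set to "C", where
strings are compared by byte value.  The helper name compare_in_c_locale
is hypothetical and not part of R.

```r
# Show the collation locale in effect for this session.
Sys.getlocale("LC_COLLATE")

# Hypothetical helper: evaluate x < y with LC_COLLATE set to "C",
# restoring the previous collation locale when the function exits.
compare_in_c_locale <- function(x, y) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  x < y
}

# In the C locale "-" (0x2D) sorts before "1" (0x31), so this is TRUE.
compare_in_c_locale("-1", "1")

# sort(method = "radix") orders character vectors as in the C locale,
# regardless of the session locale (needs R >= 3.3.0).
sort(c("1", "-1", "-"), method = "radix")
```

If the results then agree across machines, the platforms' collation
tables are the likely cause.  Setting LC_COLLATE to "C" for the whole
session is another option, at the cost of losing language-aware ordering.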

Duncan Murdoch

> Thanks,
> Stuart
> 
> R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
> Platform: x86_64-unknown-linux-gnu (64-bit)
> Sys.getlocale()
> [1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"
> 
> "-1" > "1"
> [1] TRUE
> 
> "-1" <"1"
> [1] FALSE
> 
> "1" < "-1"
> [1] TRUE
> 
> "1" < "-"
> [1] FALSE
> 
> Vs.
> 
> R version 3.1.1 (2014-07-10) -- "Sock it to Me"
> Platform: x86_64-redhat-linux-gnu (64-bit)
> Sys.getlocale()
> [1] "LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE=en_US.utf8;LC_MONETARY=en_US.utf8;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDENTIFICATION=C"
> 
> "-1" > "1"
> [1] FALSE
> 
> "-1" <"1"
> [1] TRUE
> 
> "1" < "-1"
> [1] FALSE
> 
> "1" < "-"
> [1] FALSE
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>