[R] sort() depends on locale

Duncan Murdoch murdoch.duncan at gmail.com
Sun Jun 15 13:16:55 CEST 2014


On 15/06/2014, 1:15 AM, Marius Hofert wrote:
> Hi,
> 
> If I use invisible(Sys.setlocale("LC_COLLATE", "C")) in ~/.Rprofile, then
> 
>> sort(c("L.Y", "Lu", "L.Q"))
> [1] "L.Q" "L.Y" "Lu"
> 
> whereas using invisible(Sys.setlocale("LC_COLLATE", "en_US.UTF-8")) results in
> 
>> sort(c("L.Y", "Lu", "L.Q"))
> [1] "L.Q" "Lu"  "L.Y"
> 
> I know this issue has appeared already
> (https://stat.ethz.ch/pipermail/r-help//2012-February/304089.html), I
> just don't see a reason for the second output: either '.' comes before
> letters, then the result should be
> "L.Q" "L.Y" "Lu" or it comes afterwards, then it should be "Lu" "L.Q"
> "L.Y" -- the above result thus seems inconsistent with any useful notion
> of 'sort' (?)

I don't see this either, but it appears that on your platform the "." is
simply being ignored, which might be a useful kind of sorting in some
contexts.
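
To illustrate what "ignoring the '.'" would mean, here is a minimal sketch rather than a description of what the platform's collation really does. It assumes a newer R (method = "radix" arrived in 3.3.0, as far as I know), where that method compares strings as in the C locale whatever the session locale is. The sort key (drop the dots, fold case) is a hypothetical approximation. Real en_US collation applies further tie-breaking rules that are not modelled here.

```r
x <- c("L.Y", "Lu", "L.Q")

# Byte-wise ordering as in the C locale: "." (46) sorts before any letter,
# and uppercase letters sort before lowercase ones.
c_order <- sort(x, method = "radix")
print(c_order)     # "L.Q" "L.Y" "Lu"

# Hypothetical emulation of a collation that ignores "." and case:
# build a key with the dots removed and the case folded, then order the
# original strings by that key (again byte-wise, via method = "radix").
key <- tolower(gsub(".", "", x, fixed = TRUE))   # "ly" "lu" "lq"
ignore_dot <- x[order(key, method = "radix")]
print(ignore_dot)  # "L.Q" "Lu" "L.Y"
```

With the dots dropped and case folded, the keys are "lq", "lu" and "ly", so "Lu" lands between "L.Q" and "L.Y". That is the order reported for en_US.UTF-8.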

Duncan Murdoch


