[Rd] sort infelicity

peter dalgaard pdalgd at gmail.com
Sun Jul 10 09:03:50 CEST 2011


On Jul 10, 2011, at 05:44 , Spencer Graves wrote:

> Hello:
> 
> 
>      sort(c('A', 'b', 'C')) seems to produce different answers in an interactive R session than under "R CMD check", on both Fedora 13 and Windows 7 (the Windows 7 sessionInfo() is copied below):
> 
> 
>      Interactively, the result is c('A', 'b', 'C');  under R CMD check, it is c('A', 'C', 'b').  This produced a failure in "R CMD check" that I could not replicate in interactive R:  a *.Rd file contained the equivalent of stopifnot(all.equal(sort(c('A', 'b', 'C')), c('A', 'b', 'C'))), which worked just fine interactively but failed R CMD check.
> 
> 
>      Once I understood this problem, it was easy to fix.  However, it was not easy to find, especially since I got the same problem under Fedora 13 Linux and Windows 7.
> 
> 
>      This seems to be a sufficiently obscure anomaly that I thought someone might like to see it reported here.
> 

Well, the problem is here:
[snip]
> 
> locale:
> [1] LC_COLLATE=English_United States.1252
==========================================
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252

All checks in R (unless we overlooked some) run with LC_COLLATE=C, because otherwise they give different results in different locales. One notorious example is that people expect that a file or an object called "zzz" comes out last in a sort, but Estonian sorts "z" between "s" and "t"...
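To see the check-time ordering in an interactive session, one can switch the collation locale to C temporarily. This is only a minimal sketch; the variable names (x, old_collate, sorted_in_c) are made up for illustration, and it assumes the platform lets you restore the previous locale string unchanged.

```r
x <- c("A", "b", "C")

# Remember the current collation locale so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# Switch to the C locale, which is what the checks run with.
invisible(Sys.setlocale("LC_COLLATE", "C"))
sorted_in_c <- sort(x)

# Restore the original collation locale.
invisible(Sys.setlocale("LC_COLLATE", old_collate))

# In the C locale strings compare by byte value, so all uppercase
# ASCII letters come before all lowercase ones.
print(sorted_in_c)
```

In the C locale this gives "A" "C" "b", i.e. the order seen under R CMD check rather than the interactive one.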

Notice that your .Rd example would, for the same reason, break for people with different locale settings.
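One way to make such an example independent of the reader's locale is to request a sort that does not consult LC_COLLATE at all. The sketch below assumes a version of R in which sort() accepts method = "radix" for character vectors (documented as sorting in C-locale order); that option is not something discussed in the original report, and sorted_radix is just an illustrative name.

```r
x <- c("A", "b", "C")

# Assumption: method = "radix" is available for character vectors and
# always uses C-locale ordering, whatever LC_COLLATE is set to.
sorted_radix <- sort(x, method = "radix")

# The expected value is written in C-locale order, so this check should
# give the same answer interactively and under R CMD check.
stopifnot(identical(sorted_radix, c("A", "C", "b")))
```

If the locale-specific order is what the example is meant to demonstrate, the alternative is to set LC_COLLATE explicitly inside the example and restore it afterwards.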



-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


