[Rd] Problem with order() and I()

peter dalgaard pdalgd at gmail.com
Wed Sep 10 00:54:12 CEST 2014


[This is the note I alluded to earlier today.]

On 08 Sep 2014, at 18:06 , MacQueen, Don <macqueen1 at llnl.gov> wrote:

> I have found that order() fails in a rather arcane circumstance, as in
> this example:
> 
>> foo <- I( c('x','\265g') )
>> order(foo)
> Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
>> foo <-c('x','\265g')
>> order(foo)
> [1] 1 2
> 
> 

The oddity is really that it works (for some value of "works") in the unclassed case:

> foo <- I( c('x','\265g') )
> order(foo)
Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
> foo[[1]]
[1] "x"
> foo[[2]]
[1] "\xb5g"
> foo[[1]] < foo[[2]]
[1] NA
> foo[[1]] > foo[[2]]
[1] NA


> fee <- c('x','\265g') 
> fee[[1]]
[1] "x"
> fee[[2]]
[1] "\xb5g"
> fee[[1]] < fee[[2]]
[1] NA
> fee[[1]] > fee[[2]]
[1] NA
> order(fee)
[1] 2 1

Notice that the unclassed `fee` has exactly the same issue as `foo`: its elements cannot be compared with each other.
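To see how an NA comparison turns into the error message quoted above, note that `if` refuses an NA condition. A minimal sketch, where `cmp` is just a hypothetical stand-in for the result of `foo[[1]] > foo[[2]]`, and tryCatch() is used only so the message can be captured rather than aborting:

```r
# cmp stands in for the NA produced by comparing the two strings
cmp <- NA

# if() needs TRUE or FALSE; an NA condition is an error,
# which tryCatch() turns into its message text here
msg <- tryCatch(
    if (cmp) 1L else -1L,
    error = function(e) conditionMessage(e)
)
msg
```

Any code that branches on such a comparison fails this way; code that never evaluates the comparison in an `if` will not.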

The thing is that xtfrm.AsIs will use elementwise comparison, whereas xtfrm.default will use rank(), which somehow manages to do something with character vectors for which the sort order is undefined:

> rank(foo)
Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
> rank(fee)
[1] 2 1

(Notice that xtfrm calls rank and vice versa, presumably without creating a loop. I gave up on sorting out the logic.)
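If the only goal is to get the same ordering as for the plain vector, one possible workaround is to drop the class before calling order(). This is a sketch, not a fix for the underlying issue; the names `idx_plain` and `idx_asis` are made up for illustration, and the actual order returned depends on the locale:

```r
# Same data as above: a plain character vector and its I() version
fee <- c("x", "\265g")
foo <- I(fee)

class(fee)   # plain character vector
class(foo)   # carries the "AsIs" class added by I()

# unclass() removes the "AsIs" class, so order() sees a plain
# character vector and behaves as it does for fee
idx_plain <- order(fee)
idx_asis  <- order(unclass(foo))

identical(idx_plain, idx_asis)
```

Since I() only adds a class attribute, unclass(foo) is the same object as fee, so the two index vectors agree whatever the locale decides about the ordering itself.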

> 
>> sessionInfo()
> R version 3.1.1 (2014-07-10)
> Platform: x86_64-apple-darwin13.1.0 (64-bit)
> 
> locale:
> [1] C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> Thanks
> -Don
> 
> p.s.
> Just a little background, irrelevant unless one wonders why I'm using I()
> and \265:
> 
> If I were writing new code I wouldn't be using I(), since there are better
> ways now to achieve the same end (preventing the creation of factors in
> data frames), but the scripts that use it are quite old, originally
> developed in 2001.
> 
> In at least some but perhaps limited contexts, '\265' produces the Greek
> letter mu, and that's why I'm using it. And if I remember correctly, 2001
> is prior to the current R support for locales and extended character sets.
> Using \265 is what I could find at that time to get a mu into my output.
> 
> I came across this while checking some things; it's not actually breaking
> my scripts, so I doubt it's due to any recent change.
> 
> 
> -- 
> Don MacQueen
> 
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


