[Rd] Problem with order() and I()

peter dalgaard pdalgd at gmail.com
Tue Sep 9 16:36:19 CEST 2014


It's actually a little more complicated. I wrote a note, but it seems to be stuck in the outbox on my home machine (I probably forgot to click Send...). 

One important aspect is that

> "x" < "\265g"
[1] NA

which makes me wonder if the bug really is in the case that "works". It seems that it is possible to rank() character vectors that contain incomparable elements.
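
A minimal sketch of the mechanism, for illustration only. Whether "x" < "\265g" yields NA depends on platform and locale, so NA_character_ is used here as a stand-in for the incomparable string (an assumption, not the case from the report). The last part shows one possible workaround, dropping the class by hand before ordering; it is not the fix being discussed for xtfrm.AsIs.

```r
# Stand-in for an incomparable string; the real "\265g" case is locale-dependent.
incomparable <- NA_character_

# The comparison returns NA rather than TRUE or FALSE.
cmp <- "x" > incomparable

# Using NA as an `if` condition is an error, which is the shape of the
# failure reported from `if (xi > xj) 1L else -1L`.
res <- tryCatch(if (cmp) 1L else -1L, error = function(e) e)

# Possible workaround (an assumption, not the proposed fix): strip the
# AsIs class so that order() treats the input as a plain character vector.
foo <- I(c("b", "a", "c"))
ord <- order(unclass(foo))
```

With plain ASCII input as above, the ordering does not depend on the locale, which is why it is used for the workaround rather than the original '\265g' string.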

-pd

On 09 Sep 2014, at 16:19 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:

>>>>>> MacQueen, Don <macqueen1 at llnl.gov>
>>>>>>    on Mon, 8 Sep 2014 16:06:21 +0000 writes:
> 
>> I have found that order() fails in a rather arcane circumstance, as in
>> this example:
> 
>>> foo <- I( c('x','\265g') )
>>> order(foo)
>> Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
> 
>>> foo <-c('x','\265g')
>>> order(foo)
>> [1] 1 2
> 
> yes, this is not desirable.
> order() in such cases calls xtfrm()  {as documented}
> and that ends up calling rank() and then the internal  .gt()
> where the bug happens because
> 
>> I("x") > I("\xb5g")
> [1] NA
> 
> but really I think the change should happen in xtfrm.AsIs(.)
> which I think should drop the class also in this case.
> 
> More on this, once we have fixed it.
> 
> Thank you, Don, very much!
> 
> Martin Maechler,
> ETH Zurich
> 
>>> sessionInfo()
>> R version 3.1.1 (2014-07-10)
>> Platform: x86_64-apple-darwin13.1.0 (64-bit)
> 
>> locale:
>> [1] C
> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
>> Thanks
>> -Don
> 
>> p.s.
>> Just a little background, irrelevant unless one wonders why I'm using I()
>> and \265:
> 
>> If I were writing new code I wouldn't be using I(), since there are better
>> ways now to achieve the same end (preventing the creation of factors in
>> data frames), but the scripts that use it are quite old, originally
>> developed in 2001.
> 
>> In at least some but perhaps limited contexts, '\265' produces the greek
>> letter mu, and that's why I'm using it. And if I remember correctly, 2001
>> is prior to the current R support for locales and extended character sets.
>> Using \265 is what I could find at that time to get a mu into my output.
> 
>> I came across this while checking some things; it's not actually breaking
>> my scripts, so I doubt it's due to any recent change.
> 
> 
>> -- 
>> Don MacQueen
> 
>> Lawrence Livermore National Laboratory
>> 7000 East Ave., L-627
>> Livermore, CA 94550
>> 925-423-1062
> 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


