[Rd] multiple issues with is.unsorted()

Hervé Pagès hpages at fhcrc.org
Wed Apr 24 21:00:39 CEST 2013


Hi,

On 04/24/2013 09:27 AM, William Dunlap wrote:
>>     >>> is.unsorted(NA)
>>     >> [1] NA
>>     >> => Contradicts "all objects of length 0 or 1 are sorted".
>>
>> Ok.  I really think we should change the above.
>> If NA is for a missing number, it still cannot be unsorted if it
>> is of length one.
>>
>> --> the above will give FALSE  "real soon now".
>
> It depends what you are using the result of is.unsorted() for.  If you want
> to know if you can save time by not calling x<-sort(x)  then is.unsorted(NA)
> should not say that NA is sorted, as sort(NA) has length 0.

Glad you mention this. It is a related but distinct issue: by default,
is.unsorted() and sort() don't treat NAs consistently. The former keeps
them (na.rm=FALSE), the latter removes them (na.last=NA). So if you
want to use is.unsorted() to decide whether or not to call sort()
(without specifying 'na.last'), you should use
'is.unsorted( , na.rm=TRUE)'.

This is why IMO 'is.unsorted( , na.rm=TRUE)' is an important use case
and should be as fast as possible.
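
A small sketch of that mismatch, for illustration only. The helper
name sort_if_needed() is made up here and is not part of base R; it
assumes a plain atomic vector and the default na.last=NA behaviour
of sort():

```r
x <- c(3, NA, 1)

# By default is.unsorted() keeps the NA, so it cannot give a yes/no answer
res_keep <- is.unsorted(x)                 # NA

# Dropping NAs first matches what a default sort() call would look at
res_drop <- is.unsorted(x, na.rm = TRUE)   # TRUE, because 3 > 1

# Hypothetical helper: gives what sort(x) gives with its default
# na.last = NA, but skips the sort when the non-NA values are in order
sort_if_needed <- function(x) {
  x <- x[!is.na(x)]          # sort() would drop the NAs anyway
  if (is.unsorted(x)) sort(x) else x
}

sort_if_needed(c(3, NA, 1))  # 1 3
sort_if_needed(c(1, NA, 3))  # 1 3, without calling sort()
```

The second call is the case where the na.rm=TRUE check pays off: the
only work done is removing the NA.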

If you want to keep NAs, you'll have to sort 'x' with either
na.last=TRUE or na.last=FALSE. So it makes a lot of sense that
is.unsorted(x) returns FALSE if x is a single NA, because, in that
case, 'x' doesn't need to be sorted.
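
To make the NA-keeping case concrete, here is a quick sketch (variable
names are just for illustration). The result of is.unsorted(NA) itself
is deliberately not shown, since it is exactly what is being changed:

```r
x <- c(3, NA, 1)

# Keeping NAs means choosing where they go
na_at_end   <- sort(x, na.last = TRUE)    # 1 3 NA
na_at_start <- sort(x, na.last = FALSE)   # NA 1 3

# A single NA comes back unchanged either way: nothing to sort
single_end   <- sort(NA, na.last = TRUE)
single_start <- sort(NA, na.last = FALSE)
identical(single_end, NA)     # TRUE
identical(single_start, NA)   # TRUE
```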

Cheers,
H.

>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf
>> Of Martin Maechler
>> Sent: Wednesday, April 24, 2013 8:41 AM
>> To: Hervé Pagès; R-devel at stat.math.ethz.ch
>> Cc: Martin Maechler
>> Subject: Re: [Rd] multiple issues with is.unsorted()
>>
>> More comments .. see inline
>>
>>>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>>>      on Wed, 24 Apr 2013 11:29:39 +0200 writes:
>>
>>      > Dear Herve,
>>>>>>> Hervé Pagès <hpages at fhcrc.org>
>>>>>>> on Tue, 23 Apr 2013 23:09:21 -0700 writes:
>>
>>      >> Hi, In the man page for is.unsorted():
>>
>>      >> Value:
>>
>>      >> A length-one logical value.  All objects of length 0 or 1
>>      >> are sorted: the result will be ‘NA’ for objects of length
>>      >> 2 or more except for atomic vectors and objects with a
>>      >> class (where the ‘>=’ or ‘>’ method is used to compare
>>      >> ‘x[i]’ with ‘x[i-1]’ for ‘i’ in ‘2:length(x)’).
>>
>>      >> This contains many incorrect statements:
>>
>>      >>> length(NA)
>>      >> [1] 1
>>      >>> is.unsorted(NA)
>>      >> [1] NA
>>      >>> length(list(NA))
>>      >> [1] 1
>>      >>> is.unsorted(list(NA))
>>      >> [1] NA
>>
>>      >> => Contradicts "all objects of length 0 or 1 are sorted".
>>
>> Ok.  I really think we should change the above.
>> If NA is for a missing number, it still cannot be unsorted if it
>> is of length one.
>>
>> --> the above will give FALSE  "real soon now".
>>
>>      >>> is.unsorted(raw(2))
>>      >> Error in is.unsorted(raw(2)) : unimplemented type 'raw'
>>      >> in 'isUnsorted'
>>
>>      >> => Doesn't agree with the doc (unless "except for atomic
>>      >> vectors" means "it might fail for atomic vectors").
>>
>> Well, the doc says about 'x'
>> |  \item{x}{an \R object with a class or a numeric, complex, character or
>> |    logical vector.}
>> so strictly, is.unsorted() is not to be used on raw vectors.
>>
>> However I think you have a point:
>> Raw vectors didn't exist when  is.unsorted()  was
>> invented, so were not considered back then.
>> Originally,  raw vectors were really almost only there for
>> storage, i.e. basically read and write, but now that we have
>> '<', '<=', '==' etc. working well for raw(),
>> we could allow  is.unsorted() to work, too.
>>
>> Note however, that if you try to sort(<raw>) you also always get
>> an error about sort() not being implemented for raw(),...
>> something we could arguably reconsider, as we admitted the
>> relational operators (< <= == >= >  != ) to work.
>> {{anyone donating patches to R-devel for sort()ing raw ?}}
>>
>>
>>      >>> setClass("A", representation(aa="integer"))
>>      >>> a <- new("A", aa=4:1)
>>      >>> length(a)
>>      >> [1] 1
>>
>>      >>> is.unsorted(a)
>>      >> [1] FALSE
>>      >>  Warning message: In is.na(x) : is.na() applied
>>      >> to non-(list or vector) of type 'S4'
>>
>>      >> => Ok, but it's debatable whether the warning is useful/justified
>>      >> from a user point of view. The warning *seems* to suggest
>>      >> that defining an "is.na" method for my objects is
>>      >> required for is.unsorted() to work properly but the doc
>>      >> doesn't make this clear.
>>
>> you are right.
>> We are going to improve on this, at least the documentation.
>>
>>
>> [.................]
>>
>> The S4 part I've already started addressing in the last reply.
>> (and we may get back to that.. )
>>
>> [.................]
>>
>> Martin
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


