[Rd] Expected behaviour of is.unsorted?

Duncan Murdoch murdoch.duncan at gmail.com
Thu May 24 14:20:23 CEST 2012


On 12-05-24 7:39 AM, Matthew Dowle wrote:
> Duncan Murdoch <murdoch.duncan at gmail.com> writes:
>>
>> On 12-05-23 4:37 AM, Matthew Dowle wrote:
>>>
>>> Hi,
>>>
>>> I've read ?is.unsorted and searched. Have found a few items but nothing
>>> close, yet. Is the following expected?
>>>
>>>> is.unsorted(data.frame(1:2))
>>> [1] FALSE
>>>> is.unsorted(data.frame(2:1))
>>> [1] FALSE
>>>> is.unsorted(data.frame(1:2,3:4))
>>> [1] TRUE
>>>> is.unsorted(data.frame(2:1,4:3))
>>> [1] TRUE
>>>
>>> IIUC, is.unsorted is intended for atomic vectors only (description of x in
>>> ?is.unsorted). Indeed the C source (src/main/sort.c) contains an error
>>> message "only atomic vectors can be tested to be sorted". So that is the
>>> error message I expected to see in all cases above, since I know that
>>> data.frame is not an atomic vector. But there is also this in
>>> ?is.unsorted: "except for atomic vectors and objects with a class (where
>>> the >= or > method is used)" which I don't understand. Where >= or > is
>>> used by what, and where?
>>
>> If you look at the source, you will see that the basic test for classed
>> objects is
>>
>> all(x[-1L] >= x[-length(x)])
>>
>> (in the function base:::.gtn).
>>
>> This comparison doesn't really make sense for dataframes, but it does
>> seem to be backwards:  that tests that x[2] >= x[1], x[3] >= x[2], etc.,
>> returning TRUE if all comparisons are TRUE:  but that sounds like it
>> should be is.sorted(), not is.unsorted().  Or is it my brain that is
>> backwards?
>
> Thanks. Yes you're right. So is.unsorted() on a data.frame is trying to tell us
> if there exists any unsorted row, it seems.

I would guess that it was never intended to be used this way.  It is
intended to test x[1] < x[2] < x[3] ... for objects where this is a
sensible calculation; it isn't really sensible for dataframes.
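
As a rough sketch of the above (not the actual base R source), the
pairwise comparison and its relation to is.unsorted() on an atomic vector
look like this. The helper name pairwise_nondecreasing is made up for
illustration. The last few lines show why the same expression ends up
working across columns on a data frame: single-bracket indexing there
selects columns, and length() counts columns.

```r
# Hypothetical helper, not part of base R: TRUE when every element
# is >= the one before it (the same comparison quoted from .gtn).
pairwise_nondecreasing <- function(x) all(x[-1L] >= x[-length(x)])

x <- c(1, 3, 5)
sorted_check   <- pairwise_nondecreasing(x)  # TRUE: x is in order
unsorted_check <- is.unsorted(x)             # FALSE: the negation of the above
shuffled_check <- is.unsorted(c(3, 1, 5))    # TRUE: 1 < 3 breaks the order

# On a data frame, x[-1L] drops the first COLUMN and length(x) is the
# number of columns, so the comparison pairs up neighbouring columns.
DF <- data.frame(a = c(1, 3, 5), b = c(1, 3, 5))
n_cols    <- length(DF)               # 2
left_out  <- names(DF[-1L])           # "b"
right_out <- names(DF[-length(DF)])   # "a"
```

So for a two-column data frame the expression amounts to comparing
column b against column a within each row, which would explain the
by-row behaviour reported above.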

>
>> DF = data.frame(a=c(1,3,5),b=c(1,3,5))
>> DF
>    a b
> 1 1 1               # this row is sorted
> 2 3 3               # this row is sorted
> 3 5 5               # this row is sorted
>> is.unsorted(DF)   # going by row but should be !.gtn
> [1] TRUE
>> with(DF,is.unsorted(order(a,b)))  # most people's natural expectation I guess
> [1] FALSE
>> DF[2,2]=2
>> DF
>    a b
> 1 1 1               # this row is sorted
> 2 3 2               # this row isn't sorted
> 3 5 5               # this row is sorted
>> is.unsorted(DF)   # going by row but should be !.gtn
> [1] FALSE
>> with(DF,is.unsorted(order(a,b)))  # most people's natural expectation I guess
> [1] FALSE
>
> Since it seems to have a bug anyway (and if so, can't be correct in anyone's
> use of it), could either is.unsorted on a data.frame return the error that's in
> the C code already: "only atomic vectors can be tested to be sorted", for
> safety and to lessen confusion, or be changed to return the natural expectation
> proposed above? The easiest quick fix would be to negate the result of the .gtn
> call of course, but then you could never go back.

I don't follow the last sentence.  If the .gtn call needs to be negated, 
why would you want to go back?

Duncan Murdoch

>
> Matthew
>
>> Duncan Murdoch
>>
>>>
>>> I understand why the first two are FALSE (1 item of anything must be
>>> sorted). I don't understand the 3rd and 4th cases where length is 2:
>>> do_isunsorted seems to call lang3(install(".gtn"), x, CADR(args))). Does
>>> that fall back to TRUE for some reason?
>>>
>>> Matthew
>>>
>>>> sessionInfo()
>>> R version 2.15.0 (2012-03-30)
>>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>>
>>> locale:
>>> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>> Kingdom.1252
>>> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>> [5] LC_TIME=English_United Kingdom.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] data.table_1.8.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.15.0
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
