[Rd] rank(, ties.method="last")

Martin Maechler maechler at stat.math.ethz.ch
Wed Oct 21 18:06:50 CEST 2015


>>>>> Henric Winell <nilsson.henric at gmail.com>
>>>>>     on Wed, 21 Oct 2015 13:43:02 +0200 writes:

    > On 2015-10-21 at 07:24, Suharto Anggono Suharto Anggono via R-devel wrote:
    >> Marius Hofert-4------------------------------
    >>> On 2015-10-09 at 12:14, Martin Maechler wrote:
    >>> I think so: the code above doesn't seem to do the right thing.  Consider
    >>> the following example:
    >>> 
    >>> > x <- c(1, 1, 2, 3)
    >>> > rank2(x, ties.method = "last")
    >>> [1] 1 2 4 3
    >>> 
    >>> That doesn't look right to me -- I had expected
    >>> 
    >>> > rev(sort.list(x, decreasing = TRUE))
    >>> [1] 2 1 3 4
    >>> 
    >> 
    >> Indeed, well spotted, that seems to be correct.
    >> 
    >>> 
    >>> Henric Winell
    >>> 
    >> ------------------------------
    >> 
    >> In the particular example (of length 4), what is really wanted is the following.
    >> ind <- integer(4)
    >> ind[sort.list(x, decreasing=TRUE)] <- 4:1
    >> ind

    > You don't provide the output here, but 'ind' is, of course,

    >> ind
    > [1] 2 1 3 4

    >> The following gives the desired result:
    >> sort.list(rev(sort.list(x, decreasing=TRUE)))

    > And, again, no output, but

    >> sort.list(rev(sort.list(x, decreasing=TRUE)))
    > [1] 2 1 3 4

    > Why is it necessary to use 'sort.list' on the result from 
    > 'rev(sort.list(...'?

You can try all kinds of code on this *too* simple example and do
experiments.  But let's approach this a bit more scientifically
and hence systematically:

Look at  rank  {the R function definition} to see that
for the case of no NA's,

 rank(x, ties.method = "first")   ===    sort.list(sort.list(x))

If you assume that to be correct and want to define "last" to be
correct as well (in the sense of being  "first"-consistent), 
it is clear that

  rank(x, ties.method = "last")   ===  rev(sort.list(sort.list(rev(x))))

must also be correct.  I don't think that *any* of the proposals
so far had a correct version [but the too simplistic examples
did not show the problems].
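
As a quick check of the two identities above, here is a minimal sketch
on a vector whose ties are not adjacent.  The vector y and the names
first_y and last_y are assumptions chosen for illustration; they do not
come from the thread.

```r
## Illustration only: y, first_y and last_y are made-up names.
y <- c(2, 1, 2, 1)               # ties that are not next to each other

## "first": among tied values, the earlier element gets the smaller rank
first_y <- sort.list(sort.list(y))
first_y
## [1] 3 1 4 2

## "last": reverse, rank with "first", reverse back, so that
## among tied values the later element gets the smaller rank
last_y <- rev(sort.list(sort.list(rev(y))))
last_y
## [1] 4 2 3 1

## the "first" identity can be compared with base rank() directly
all.equal(rank(y, ties.method = "first"), first_y)
```

Within each group of ties the two results assign the same set of ranks,
only in opposite order, which is the sense of "first"-consistent here.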

In today's R-devel (the R development version), i.e., svn
revision >= 69549, the implementation of  ties.method = "last"
uses
        ## == rev(sort.list(sort.list(rev(x)))) :
        if(length(x) == 0) integer(0)
        else { i <- length(x):1L
               sort.list(sort.list(x[i]))[i] },

which is equivalent to using rev() but a bit more efficient.
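
To see the equivalence concretely, the following sketch wraps both
forms in functions and compares them.  The function names rank_last_rev
and rank_last_idx, the test vector z and the seed are assumptions made
for this illustration; they are not part of base R.

```r
## Illustration only: hypothetical helper names, not base R functions.
rank_last_rev <- function(x) rev(sort.list(sort.list(rev(x))))

rank_last_idx <- function(x) {
    if (length(x) == 0) return(integer(0))
    i <- length(x):1L              # one reversing index, used twice
    sort.list(sort.list(x[i]))[i]
}

rank_last_idx(c(1, 1, 2, 3))       # the example from the thread
## [1] 2 1 3 4

set.seed(1)                        # arbitrary seed, reproducibility only
z <- sample(3, 20, replace = TRUE) # few distinct values, hence many ties
identical(rank_last_idx(z), rank_last_rev(z))

rank_last_idx(numeric(0))
## integer(0)
```

The length-0 guard matters because length(x):1L counts up to c(0L, 1L)
for an empty x, so the index i would no longer be a reversal.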

Martin Maechler, ETH Zurich


