[R] rank(x,y)?

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jun 22 04:02:33 CEST 2006


On 6/21/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> Peter Dalgaard wrote:
> > Duncan Murdoch <murdoch at stats.uwo.ca> writes:
> >
> >
> >> Suppose I have two columns, x,y.  I can use order(x,y) to calculate a
> >> permutation that puts them into increasing order of x,
> >> with ties broken by y.
> >>
> >> I'd like instead to calculate the rank of each pair under the same
> >> ordering, but the rank() function doesn't take multiple values
> >> as input.  Is there a simple way to get what I want?
> >>
> >> E.g.
> >>
> >>  > x <- c(1,2,3,4,1,2,3,4)
> >>  > y <- c(1,2,3,1,2,3,1,2)
> >>  > rank(x+y/10)
> >> [1] 1 3 6 7 2 4 5 8
> >>
> >> gives me the answer I want, but only because I know the range of y and
> >> the size of gaps in the x values.  What do I do in general?
> >>
> >
> > Still not quite general, but in the absence of ties:
> >
> >
> >> z[order(x,y)]<-1:8
> >> z
> >>
> > [1] 1 3 6 7 2 4 5 8
> >
> >
>
> Thanks to all who have replied.  Unfortunately for me, ties do exist,
> and I'd like them to get identical ranks.  John Fox's suggestion would
> handle ties properly, but I'm worried about rounding error giving
> spurious ties.
>

Try this variant of my prior solution:

(order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2

Note that no arithmetic is done on the original data, only on
the output of order, so there should not be any worry about
rounding -- in fact it's sufficiently general that the data
do not have to be numeric, e.g.

> x <- c("a", "a", "b", "a", "c", "d")
> y <- c("b", "a", "b", "b", "a", "a")
> (order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2
[1] 2.5 1.0 4.0 2.5 5.0 6.0
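
To see why this works: order(order(x,y)) ranks the pairs with ties
broken by first occurrence, while the reversed term ranks them with
ties broken by last occurrence, so the mean of the two gives each
tied pair the average of the positions its group occupies. Here is a
minimal sketch that wraps the expression in a function. The name
rank2 and the intermediate names are made up for illustration and
are not part of base R; it assumes x and y have the same length and
contain no NAs.

```r
# Hypothetical helper (not in base R): average rank of (x, y) pairs,
# ordered by x with ties in x broken by y.
rank2 <- function(x, y) {
  stopifnot(length(x) == length(y))
  # Rank with tied pairs numbered in order of first occurrence
  # (order() leaves unresolved ties in their original order).
  first <- order(order(x, y))
  # Rank with tied pairs numbered in order of last occurrence:
  # reverse the data, rank, then reverse the result back.
  last <- rev(order(order(rev(x), rev(y))))
  # The mean of the two gives tied pairs their average rank.
  (first + last) / 2
}

# Numeric example from earlier in the thread (no ties)
rank2(c(1, 2, 3, 4, 1, 2, 3, 4), c(1, 2, 3, 1, 2, 3, 1, 2))
# [1] 1 3 6 7 2 4 5 8

# Character example with one tied pair ("a", "b")
rank2(c("a", "a", "b", "a", "c", "d"), c("b", "a", "b", "b", "a", "a"))
# [1] 2.5 1.0 4.0 2.5 5.0 6.0
```

With a constant second argument this should agree with the default
rank(x), which also averages ties.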


