[R] Function rank() for data frames (or multiple vectors)?

peter dalgaard pdalgd at gmail.com
Mon Aug 29 19:56:19 CEST 2011


On Aug 29, 2011, at 15:39 , Sebastian Bauer wrote:

>> 
>> > rr <- data.frame(a = c(1,1,1,1,2), b=c(1,2,2,3,1))
>> 
>> > ave(order(rr$a, rr$b), rr$a, rr$b )
>> [1] 1.0 2.5 2.5 4.0 5.0
> 
> Actually, this may be the solution I was looking for! Note that it assumes rr is already sorted (hence the first argument of ave could simply be 1:nrow(rr)). Also, by using FUN=min or FUN=max I can cover the other cases. Thanks for this!
> 

Yes, order() and rank() are different beasts so you'd need the presort.

You might consider this:

> rr <- data.frame(a = c(1,1,1,2,2), b=c(2,2,1,3,1))
> rr
  a b
1 1 2
2 1 2
3 1 1
4 2 3
5 2 1

> ave(order(rr$a, rr$b), rr$a, rr$b ) # WRONG!
[1] 2 2 2 5 4
> ave(order(order(rr$a, rr$b)), rr$a, rr$b )
[1] 2.5 2.5 1.0 5.0 4.0
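
The same idea can be wrapped in a small function so that the tie handling (mean, min, max) becomes an argument. This is only a sketch: rank_rows and its ties argument are made-up names, not part of base R, and it assumes the key columns contain no NA values. The grouping factor is built with drop = TRUE so that the tie function is never called on a key combination that does not occur in the data.

```r
# Hypothetical helper (not in base R): rank the rows of a data frame
# lexicographically by all of its columns, resolving ties with `ties`.
rank_rows <- function(df, ties = mean) {
  keys <- unname(as.list(df))           # columns as an unnamed list
  r <- order(do.call(order, keys))      # tie-free ranks of the rows
  g <- interaction(keys, drop = TRUE)   # one level per distinct row
  ave(r, g, FUN = ties)                 # combine ranks within tied rows
}

rr <- data.frame(a = c(1, 1, 1, 2, 2), b = c(2, 2, 1, 3, 1))
rank_rows(rr)              # 2.5 2.5 1.0 5.0 4.0
rank_rows(rr, ties = min)  # 2 2 1 5 4
rank_rows(rr, ties = max)  # 3 3 1 5 4
```

The three calls correspond to what rank() does for a single vector with ties.method = "average", "min" and "max".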

Figuring out why order(order(x)) == rank(x) when there are no ties is "left as an exercise" (i.e., I can't recall the argument just now...).
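
One way to approach the exercise, offered as a numerical check rather than a proof: order(x) is a permutation p whose k-th entry is the index of the k-th smallest element, and applying order() to a permutation yields its inverse. The inverse maps each index back to its position in the sorted sequence, which is its rank. The variable names and the seed below are arbitrary choices for illustration.

```r
set.seed(1)            # arbitrary seed, only for reproducibility
x <- sample(100, 10)   # 10 distinct values, so no ties

p <- order(x)          # p[k] is the index of the k-th smallest element
q <- order(p)          # q[i] is the position at which p equals i

# q is the inverse permutation of p ...
stopifnot(all(p[q] == seq_along(x)), all(q[p] == seq_along(x)))
# ... so q[i] == k exactly when x[i] is the k-th smallest, i.e. its rank
stopifnot(all(q == rank(x)))
```

With ties present, order() breaks them by position, so order(order(x)) matches rank(x, ties.method = "first") rather than the default average ranks.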


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg


