[R] rank(x,y)?

Duncan Murdoch murdoch at stats.uwo.ca
Thu Jun 22 23:09:20 CEST 2006


Gabor Grothendieck wrote:
> On 6/21/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>   
>> Peter Dalgaard wrote:
>>     
>>> Duncan Murdoch <murdoch at stats.uwo.ca> writes:
>>>
>>>
>>>       
>>>> Suppose I have two columns, x,y.  I can use order(x,y) to calculate a
>>>> permutation that puts them into increasing order of x,
>>>> with ties broken by y.
>>>>
>>>> I'd like instead to calculate the rank of each pair under the same
>>>> ordering, but the rank() function doesn't take multiple values
>>>> as input.  Is there a simple way to get what I want?
>>>>
>>>> E.g.
>>>>
>>>>  > x <- c(1,2,3,4,1,2,3,4)
>>>>  > y <- c(1,2,3,1,2,3,1,2)
>>>>  > rank(x+y/10)
>>>> [1] 1 3 6 7 2 4 5 8
>>>>
>>>> gives me the answer I want, but only because I know the range of y and
>>>> the size of gaps in the x values.  What do I do in general?
>>>>
>>>>         
>>> Still not quite general, but in the absence of ties:
>>>
>>>
>>>       
>>>> z <- integer(8)
>>>> z[order(x,y)] <- 1:8
>>>> z
>>>>
>>>>         
>>> [1] 1 3 6 7 2 4 5 8
>>>
>>>
>>>       
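The no-ties case above can be sketched without hard-coding the length. This is only an illustration using the data from the original question; it relies on the fact that applying order() to a permutation returns its inverse, so order(order(x, y)) gives each pair's position in the sorted sequence.

```r
# Data from the original question
x <- c(1, 2, 3, 4, 1, 2, 3, 4)
y <- c(1, 2, 3, 1, 2, 3, 1, 2)

# order(x, y) is the permutation that sorts the pairs by x, then by y.
o <- order(x, y)

# Applying order() again inverts that permutation, giving each pair's rank.
r <- order(o)

# Equivalent explicit form: fill a preallocated vector at the sorted positions.
z <- integer(length(x))
z[o] <- seq_along(x)

print(r)                # 1 3 6 7 2 4 5 8
print(identical(r, z))  # TRUE
```

Tied pairs still get distinct ranks here (numbered in order of appearance), which is why this only answers the question when there are no ties.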
>> Thanks to all who have replied.  Unfortunately for me, ties do exist,
>> and I'd like them to get identical ranks.  John Fox's suggestion would
>> handle ties properly, but I'm worried about rounding error giving
>> spurious ties.
>>
>>     
>
> Try this variant of my prior solution:
>
> (order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2
>
> Note that no arithmetic is done on the original data, only on
> the output of order, so there should not be any worry about
> rounding -- in fact it's sufficiently general that the data
> do not have to be numeric, e.g.
>
>   
>> x <- c("a", "a", "b", "a", "c", "d")
>> y <- c("b", "a", "b", "b", "a", "a")
>> (order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2
>>     
> [1] 2.5 1.0 4.0 2.5 5.0 6.0
>   

This is a very nice solution, thanks!
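
For readers who want to reuse the expression, here is a minimal sketch that wraps it in a function taking any number of key vectors. The name rank_avg is hypothetical (it is not a base R function), and the sketch assumes equal-length keys with no NAs. It works because order() leaves ties in their original order: the forward pass numbers tied pairs front to back, the pass on the reversed data numbers them back to front, and the mean of the two is the average rank of the tie group.

```r
# rank_avg is a hypothetical helper name, not part of base R.
rank_avg <- function(...) {
  keys <- unname(list(...))
  # Forward pass: tied tuples are numbered in order of appearance.
  fwd <- order(do.call(order, keys))
  # Backward pass: reverse every key so tied tuples are numbered in the
  # opposite order, then reverse the result back to the original positions.
  bwd <- rev(order(do.call(order, lapply(keys, rev))))
  # Averaging the two numberings gives the mean rank within each tie group.
  (fwd + bwd) / 2
}

x <- c("a", "a", "b", "a", "c", "d")
y <- c("b", "a", "b", "b", "a", "a")
print(rank_avg(x, y))  # 2.5 1.0 4.0 2.5 5.0 6.0
```

With a single key the result should agree with rank() under its default tie handling, e.g. rank_avg(c(3, 1, 3, 2)) and rank(c(3, 1, 3, 2)).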

So now we have equivalents to ties.method="average" and "first";
"random" would be easy.  I wonder if it's worth working out "max" and
"min" as well and putting them all in a new function?
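
One possible way to get the "min" and "max" variants is sketched below. It is not from the thread: the function name rank_minmax and its interface are assumptions made for illustration, and it assumes equal-length, non-empty keys with no NAs. The idea is to sort once, mark where a sorted tuple differs from its predecessor, and give every member of a tie group the first (min) or last (max) sorted position of that group. Only equality comparisons are made on the original data, so no rounding is involved.

```r
# rank_minmax is a hypothetical helper name, not part of base R.
rank_minmax <- function(..., ties.method = c("min", "max")) {
  ties.method <- match.arg(ties.method)
  keys <- unname(list(...))
  n <- length(keys[[1]])
  o <- do.call(order, keys)

  # is_new[i] is TRUE when the i-th sorted tuple differs from the previous
  # one in at least one key, i.e. it starts a new tie group.
  is_new <- rep(FALSE, n)
  for (k in keys) {
    s <- k[o]
    is_new <- is_new | c(TRUE, s[-1] != s[-n])
  }

  starts <- which(is_new)            # first sorted position of each group
  ends <- c(starts[-1] - 1L, n)      # last sorted position of each group
  g <- cumsum(is_new)                # group index of each sorted tuple

  ranks_sorted <- switch(ties.method, min = starts[g], max = ends[g])

  # Map the ranks from sorted order back to the original positions.
  out <- numeric(n)
  out[o] <- ranks_sorted
  out
}

x <- c("a", "a", "b", "a", "c", "d")
y <- c("b", "a", "b", "b", "a", "a")
print(rank_minmax(x, y, ties.method = "min"))  # 2 1 4 2 5 6
print(rank_minmax(x, y, ties.method = "max"))  # 3 1 4 3 5 6
```

The mean of the "min" and "max" results reproduces the average ranks 2.5 1.0 4.0 2.5 5.0 6.0 for this example.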

Duncan Murdoch