[R] I don't understand the 'order' function

Wed Apr 17 11:52:58 CEST 2013

On Apr 17, 2013, at 10:41 , Patrick Burns wrote:

> There is a blog post about this:
> 
> http://www.portfolioprobe.com/2012/07/26/r-inferno-ism-order-is-not-rank/
> 
> And proof that it is possible to confuse them
> even when you know the difference.

It usually helps to remember that x[order(x)] is sort(x) (and that x[rank(x)] is nothing of the sort).
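A quick way to see this at the console, as a minimal sketch: the vector below is made up for illustration (it is not data from the thread) and is chosen so that order(x) and rank(x) differ.

```r
# Hypothetical data, chosen so that order(x) and rank(x) are different vectors
x <- c(0.4, 0.1, 1.3, -0.5, -1.1)

o <- order(x)   # positions of the smallest, 2nd smallest, ... element of x
r <- rank(x)    # rank of each element of x within x

sorted_by_order <- x[o]   # same values as sort(x)
indexed_by_rank <- x[r]   # a rearrangement of x, but not a sorted one

print(o)
print(r)
print(sorted_by_order)
print(indexed_by_rank)
```

Indexing by rank(x) still rearranges x, just not into increasing order.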

It's somewhat elusive, but not impossible, to realize that the two are inverses (if there are no ties). Duncan M. indicated it nicely earlier in the thread: rank() is how to permute the ordered data to get those observed; order() is how to permute the data to put them in order.
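One way to make "inverses" concrete is to invert order(x) by hand and compare the result with rank(x). This is only a sketch on made-up, tie-free data; the names inv and recovered are mine.

```r
# Hypothetical tie-free data
x <- c(0.4, 0.1, 1.3, -0.5, -1.1)

o <- order(x)

# Invert the permutation o: if o[i] == j, then inv[j] == i
inv <- integer(length(o))
inv[o] <- seq_along(o)

# With no ties, the inverse of order(x) has the same values as rank(x)
print(inv)
print(rank(x))

# rank() permutes the ordered data back into the observed data
recovered <- sort(x)[rank(x)]
print(recovered)
```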

They are inverses in terms of composition of permutations, not as transformations of sets of integers: rank(order(x)) and order(rank(x)) are both equal to order(x), whereas the corresponding permutation matrices multiply to the identity:

> x <- rnorm(5)
> rank(x)
[1] 4 3 5 2 1
> order(x)
[1] 5 4 2 1 3
> ## permutation matrix
> P <- matrix(0,5,5); diag(P[,order(x)]) <- 1
> P %*% 1:5
     [,1]
[1,]    5
[2,]    4
[3,]    2
[4,]    1
[5,]    3
> P2 <- matrix(0,5,5); diag(P2[,rank(x)]) <- 1
> P2 %*% 1:5
     [,1]
[1,]    4
[2,]    3
[3,]    5
[4,]    2
[5,]    1
> P %*% P2
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    1    0    0    0
[3,]    0    0    1    0    0
[4,]    0    0    0    1    0
[5,]    0    0    0    0    1

Or, as Duncan put it:
rank(x)[order(x)] and order(x)[rank(x)] are 1:length(x) (again, if there are no ties).
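These identities are easy to check numerically. The sketch below uses a seed of my own choosing so that it is reproducible; with rnorm() ties are not a practical concern.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
x <- rnorm(5)

o <- order(x)
r <- rank(x)
n <- seq_along(x)  # 1:length(x)

# Composing the two permutations by indexing gives the identity permutation
comp1 <- r[o]
comp2 <- o[r]

# Nesting the function calls does not: both give order(x) back
nested1 <- rank(o)
nested2 <- order(r)

print(all(comp1 == n))
print(all(comp2 == n))
print(all(nested1 == o))
print(all(nested2 == o))
```

Comparisons use all(... == ...) rather than identical() because rank() returns doubles by default while order() returns integers.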

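The thing that tends to blow my mind is that order(order(x))==rank(x). I can't seem to get my intuition to fathom it, although there's a fairly easy proof in that

1:N == sort(order(x)) == order(x)[order(order(x))] == order(x)[rank(x)]

and, order(x) being a permutation, two index vectors that pull the same result out of it must be the same, so order(order(x)) has to equal rank(x).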
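A small check of this, plus what happens when the no-ties assumption fails. The vectors are made up for illustration; the ties.method = "first" comparison is my addition, not something stated in the thread.

```r
# Hypothetical tie-free data: order(order(x)) matches rank(x)
x <- c(0.4, 0.1, 1.3, -0.5, -1.1)
oo <- order(order(x))
print(oo)
print(rank(x))

# Hypothetical data with a tie: the default rank() averages tied ranks,
# so it can no longer match the integer permutation order(order(y))
y <- c(2, 1, 2)
oo_ties    <- order(order(y))
rank_avg   <- rank(y)                         # default ties.method = "average"
rank_first <- rank(y, ties.method = "first")  # ties broken by position
print(oo_ties)
print(rank_avg)
print(rank_first)
```

With ties broken by position, rank() agrees with order(order(y)) again, since order() also resolves ties in order of appearance.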

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com