[R] about the function order()

Martin Maechler maechler at stat.math.ethz.ch
Tue Nov 27 15:49:25 CET 2001


>>>>> "Emmanuel" == Emmanuel Charpentier <charpent at bacbuc.dyndns.org> writes:

    Emmanuel> According to Thomas Lumley :

    >> In a sense it's doing the opposite of what you thought.

    >> The definition of order() is basically that
    >> a[order(a)]
    >> is in increasing order. This works with your example, where the correct
    >> order is the fourth, second, first, then third element.

    >> You may have been looking for rank(), which returns the rank of the
    >> elements
    >> R> a <- c(4.1, 3.2, 6.1, 3.1)
    >> R> order(a)
    >> [1] 4 2 1 3
    >> R> rank(a)
    >> [1] 3 2 4 1
    >> so rank() tells you what order the numbers are in, order() tells you how
    >> to get them in ascending order.

    Emmanuel> Hmmm ... meaning that order behaves like the
    Emmanuel> (gradeup) APL function, right ? (Yes, I'm *that*
    Emmanuel> old ...).

    Emmanuel> That should imply that, barring possible ties,
    Emmanuel> rank(x) == order(order(x)). Right ?

    Emmanuel> So : why distinct implementations ? Are there
    Emmanuel> efficiency considerations I'm missing ?

    Emmanuel> Or am I completely mistaken ?

no, only partly:

1) order(order(x)) is the same as rank(x) only when there are no ties:

 > set.seed(101); x <- round(runif(20),3) ; any(duplicated(x))
 [1] FALSE
 > rank(x)
  [1] 15 16 13 11  7 18  8 19 20 14  6 17  9  2 12  5  4 10  1  3
 > all(rank(x) == order(order(x)))
 [1] TRUE
i.e., they *are* the same when there are no ties.
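
Why the identity holds without ties: order(x) is a permutation, and
order(order(x)) is its inverse, which gives each element's position in the
sorted vector. The sketch below checks this on the tie-free vector from the
quoted example; the names `o` and `inv` are mine, for illustration only. It
also shows that the inverse can be built by plain index assignment, without a
second sort. I have not timed the two approaches, so take that as a remark
about the amount of work, not a benchmark.

```r
# Tie-free vector taken from the quoted example above
x <- c(4.1, 3.2, 6.1, 3.1)

o <- order(x)                  # permutation that sorts x
stopifnot(!is.unsorted(x[o]))  # x[order(x)] is in increasing order

# Invert the permutation by index assignment (no second sort needed)
inv <- integer(length(x))
inv[o] <- seq_along(x)

# The inverse permutation is exactly order(order(x)) ...
stopifnot(identical(inv, order(o)))
# ... and, with no ties, it agrees with rank(x)
stopifnot(all(rank(x) == inv))
```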

 > x <- c(3, 4,2,4,2,5,2,1,5)
 > cbind(x = x, rankx = rank(x), oox = order(order(x)))
       x rankx oox
  [1,] 3   5.0   5
  [2,] 4   6.5   6
  [3,] 2   3.0   2
  [4,] 4   6.5   7
  [5,] 2   3.0   3
  [6,] 5   8.5   8
  [7,] 2   3.0   4
  [8,] 1   1.0   1
  [9,] 5   8.5   9
 > 

 So you see that, whenever x values are tied, rank(x) is the mean of the
 corresponding order(order(x)) values (the oox column).
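
 The relation in the table can be checked directly. This is only a sketch on
 the same x as above; the names `oox` and `avg` are mine. It assumes a version
 of R in which rank() has the ties.method argument: with ties.method = "first",
 rank() breaks ties by position, which is what order(order(x)) does as well.

```r
x <- c(3, 4, 2, 4, 2, 5, 2, 1, 5)
oox <- order(order(x))

# Mean of oox within each group of tied x values
avg <- ave(as.numeric(oox), x)
stopifnot(all(rank(x) == avg))

# Breaking ties by position reproduces order(order(x)) exactly
stopifnot(all(rank(x, ties.method = "first") == oox))
```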


2) I used to be an APL aficionado myself: it was my first `real' (;-)
   computer language, back in high school.
   Efficiency considerations are indeed quite an issue.
   Iverson's APL came from the partly wrong idea that all we need
   in computing is just a bit more general than what math has
   been doing all along.  Hence all these monadic and dyadic
   operators, which *did* have a beauty, but also *did* lead to
   contorted programming...
   The idea that you only needed something new if it was not
   available as a composition of a few APL operators led to lots
   of inefficiencies.
   If there were no R (S), maybe I'd still be using APL
   occasionally (I'm kidding); I was once proud of a fast one-liner
   prime-number function.... ;-)

Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO D10	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><