[R] about the function order()
Martin Maechler
maechler at stat.math.ethz.ch
Tue Nov 27 15:49:25 CET 2001
>>>>> "Emmanuel" == Emmanuel Charpentier <charpent at bacbuc.dyndns.org> writes:
Emmanuel> According to Thomas Lumley :
>> In a sense it's doing the opposite of what you thought.
>> The definition of order() is basically that
>> a[order(a)]
>> is in increasing order. This works with your example, where the correct
>> order is the fourth, second, first, then third element.
>> You may have been looking for rank(), which returns the rank of the
>> elements
>> R> a <- c(4.1, 3.2, 6.1, 3.1)
>> R> order(a)
>> [1] 4 2 1 3
>> R> rank(a)
>> [1] 3 2 4 1
>> so rank() tells you what order the numbers are in, order() tells you how
>> to get them in ascending order.
Emmanuel> Hmmm ... meaning that order behaves like the
Emmanuel> (gradeup) APL function, right ? (Yes, I'm *that*
Emmanuel> old ...).
Emmanuel> That should imply that, barring possible ties,
Emmanuel> rank(x) == order(order(x)). Right ?
Emmanuel> So : why distinct implementations ? Are there
Emmanuel> efficiency considerations I'm missing ?
Emmanuel> Or am I completely mistaken ?
no, only partly:
1) order(order(x)) is only the same as rank() when there are no ties :
> set.seed(101); x <- round(runif(20),3) ; any(duplicated(x))
[1] FALSE
> rank(x)
[1] 15 16 13 11 7 18 8 19 20 14 6 17 9 2 12 5 4 10 1 3
> all(rank(x) == order(order(x)))
[1] TRUE
i.e., they *are* the same when there are no ties. With ties, however:
> x <- c(3, 4,2,4,2,5,2,1,5)
> cbind(x = x, rankx = rank(x), oox = order(order(x)))
     x rankx oox
[1,] 3   5.0   5
[2,] 4   6.5   6
[3,] 2   3.0   2
[4,] 4   6.5   7
[5,] 2   3.0   3
[6,] 5   8.5   8
[7,] 2   3.0   4
[8,] 1   1.0   1
[9,] 5   8.5   9
>
So you see that rank(x) is really the mean of the corresponding
order(order(x)) values (the "oox" column) whenever x values are tied.
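
The relationship can be checked directly. The following is only a minimal sketch, not part of the original session: it assumes a current R in which rank() accepts a ties.method argument, and the variable names (oox, p, inv and the *_ok flags) are made up for illustration. It shows that order(order(x)) agrees with rank() when ties are broken by position, that averaging it within tied groups gives the default rank(), and that order() applied to a permutation yields the inverse permutation.

```r
x <- c(3, 4, 2, 4, 2, 5, 2, 1, 5)
oox <- order(order(x))

# order() breaks ties by original position, so order(order(x))
# should match rank() with ties.method = "first" (assumed available)
first_ok <- all(rank(x, ties.method = "first") == oox)

# Averaging oox within each group of tied x values should
# reproduce the default (average) rank()
avg_ok <- isTRUE(all.equal(ave(as.numeric(oox), x), rank(x)))

# p = order(x) is a permutation; build its inverse by hand and
# compare with order(p)
p <- order(x)
inv <- integer(length(p))
inv[p] <- seq_along(p)
inv_ok <- all(inv == oox)

print(c(first = first_ok, average = avg_ok, inverse = inv_ok))
```

If all three flags come out TRUE, that supports reading rank() as "order(order(x)) plus a rule for ties", which is one plausible reason for keeping it as a separate function.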
2) I used to be an APL aficionado myself: it was my first `real' (;-)
computer language, back in my high-school days.
Efficiency considerations are indeed quite an issue.
Iverson's APL came from the partly wrong idea that all we need
in computing is just a bit more general than what math has
been doing all along. Hence all these monadic and dyadic
operators which *did* have a beauty, but also *did* lead to
contorted programming...
The idea that you only needed new things if they were not
available by a few compositions of APL operators led to lots
of inefficiencies.
If there were no R (S), maybe I'd still be using APL
occasionally (I'm kidding); I was once proud of a fast one-liner
prime-number function.... ;-)
Martin Maechler <maechler at stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO D10 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><