[R] Simple order() data frame question.

Ivan Calandra ivan.calandra at uni-hamburg.de
Thu May 12 16:19:55 CEST 2011


I was wondering whether it would be possible to write a sort() method 
for data frames.
I think it would be more intuitive than the rather complex construction 
df[order(df$a), ].
Is there any reason not to provide one?
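
To make the idea concrete, here is a rough sketch of what such a method might look like. It is not part of base R; the argument name 'by' and its default (the first column) are my own assumptions, chosen only for illustration.

```r
# Hypothetical S3 method, NOT part of base R; defined here only as a sketch.
# 'by' (an assumed argument name) lists the columns to order on.
sort.data.frame <- function(x, decreasing = FALSE, by = names(x)[1], ...) {
  # Collect the key columns; drop their names so they cannot be mistaken
  # for named arguments of order()
  keys <- unname(as.list(x[by]))
  # Equivalent to order(x[[by[1]]], x[[by[2]]], ..., decreasing = decreasing)
  idx <- do.call(order, c(keys, list(decreasing = decreasing)))
  # Reorder the rows, keeping the result a data frame
  x[idx, , drop = FALSE]
}

df <- data.frame(a = c(3, 1, 2), b = c("x", "y", "z"))
sorted <- sort(df, by = "a")   # rows come back in the order 2, 3, 1
```

Internally this is still df[order(df$a), ], so it only hides the construction rather than replacing it.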

Ivan

On 5/12/2011 15:40, Marc Schwartz wrote:
> On May 12, 2011, at 8:09 AM, John Kane wrote:
>
>> Argh.  I knew it was at least partly obvious.  I never have been able to read the order() help page and understand what it is saying.
>>
>> Thanks very much.
>>
>> By the way, to me it is counter-intuitive that the command is
>>
>>> df1[order(df1[,2],decreasing=TRUE),]
>> For some reason I keep expecting it to be
>> order( , df1[,2],decreasing=TRUE)
>>
>> So clearly I don't understand what is going on, but at least I am a lot better off.  I may be able to get this graph to work.
>
> John,
>
> Perhaps it may be helpful to understand that order() does not actually sort() the data.
>
> It returns a vector of indices into the data, where those indices are the sorted ordering of the elements in the vector, or in this case, the column.
>
> So you want the output of order() to be used within the brackets for the row *indices*, to reflect the ordering of the column (or columns in the case of a multi-level sort) that you wish to use to sort the data frame rows.
>
> set.seed(1)
> x <- sample(10)
>
>> x
>   [1]  3  4  5  7  2  8  9  6 10  1
>
>
> # sort() actually returns the sorted data
>> sort(x)
>   [1]  1  2  3  4  5  6  7  8  9 10
>
>
> # order() returns the indices of 'x' in sorted order
>> order(x)
>   [1] 10  5  1  2  3  8  4  6  7  9
>
>
> # This does the same thing as sort()
>> x[order(x)]
>   [1]  1  2  3  4  5  6  7  8  9 10
>
>
> set.seed(1)
> df1 <- data.frame(aa = letters[1:10], bb = rnorm(10))
>
>> df1
>     aa         bb
> 1   a -0.6264538
> 2   b  0.1836433
> 3   c -0.8356286
> 4   d  1.5952808
> 5   e  0.3295078
> 6   f -0.8204684
> 7   g  0.4874291
> 8   h  0.7383247
> 9   i  0.5757814
> 10  j -0.3053884
>
>
> # These are the indices of df1$bb in sorted order
>> order(df1$bb)
>   [1]  3  6  1 10  2  5  7  9  8  4
>
>
> # Get df1$bb in increasing order
>> df1$bb[order(df1$bb)]
>   [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
>   [7]  0.4874291  0.5757814  0.7383247  1.5952808
>
>
> # Same thing as above
>> sort(df1$bb)
>   [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
>   [7]  0.4874291  0.5757814  0.7383247  1.5952808
>
>
> You can't use the output of sort() to sort the data frame rows, so you need to use order() to get the ordered indices and then use that to extract the data frame rows in the sort order that you desire:
>
>> df1[order(df1$bb), ]
>     aa         bb
> 3   c -0.8356286
> 6   f -0.8204684
> 1   a -0.6264538
> 10  j -0.3053884
> 2   b  0.1836433
> 5   e  0.3295078
> 7   g  0.4874291
> 9   i  0.5757814
> 8   h  0.7383247
> 4   d  1.5952808
>
>
>> df1[order(df1$bb, decreasing = TRUE), ]
>     aa         bb
> 4   d  1.5952808
> 8   h  0.7383247
> 9   i  0.5757814
> 7   g  0.4874291
> 5   e  0.3295078
> 2   b  0.1836433
> 10  j -0.3053884
> 1   a -0.6264538
> 6   f -0.8204684
> 3   c -0.8356286
>
>
> Does that help?
>
> Regards,
>
> Marc Schwartz
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
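
Regarding the multi-level sort mentioned in the quoted reply: as far as I understand it, order() accepts several keys and uses the later ones to break ties in the earlier ones. A small sketch with made-up data (the column names 'grp' and 'val' are only placeholders):

```r
# Made-up example data; 'grp' and 'val' are placeholder column names
df <- data.frame(
  grp = c("b", "a", "b", "a"),
  val = c(2, 5, 1, 3),
  stringsAsFactors = FALSE
)

# Order by grp first; ties within grp are broken by val
idx <- order(df$grp, df$val)       # 4 2 3 1
by_grp_val <- df[idx, ]

# Mixed directions: grp ascending, val descending.
# Negating the key only works for numeric columns.
idx_mixed <- order(df$grp, -df$val)   # 2 4 1 3
by_grp_val_desc <- df[idx_mixed, ]
```

In both cases the index vector goes in the row position of the brackets, exactly as in the single-column examples.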

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php


