[R] having problems re-ordering a dataframe

Chabot Denis chabotd at globetrotter.net
Sat Oct 27 20:33:41 CEST 2007


Thank you very much for your detailed explanation and two solutions,  
Duncan.

Denis
On 07-10-27 at 12:44, Duncan Murdoch wrote:

> Chabot Denis wrote:
>> Dear R users,
>>
>> I need to reorder a dataframe using 3 variables to determine the
>> sorting order.
>>
>> When I create a simple dataframe to test the method, things work
>> as I expected:
>>
>> a1 <- rep(1:10, each=8)
>> a2 <- rep(rep(1:2, each=4), 10)
>> a3 <- rep(c(1:4),20)
>> (a <- data.frame(a1, a2, a3))
>>
>> for each combination of a1 and a2, a3 is increasing
>>
>> t <- order(a$a1, a$a2, rev(a$a3))
>> b <- a[t,]
>>
> Using "rev(a$a3)" messes things up when the values of a3 aren't  
> identical in every (a1, a2) group.  If you know a3 is numeric with  
> no NA or NaN values, you could use
>
> t <- order(a$a1, a$a2, -a$a3)
>
> to get the sort order reversed.  If not, you need to do two sorts,  
> relying on the fact that order uses a stable sort:
>
> t1 <- order(a$a3, decreasing=TRUE)
> t2 <- with(a[t1,], order(a1, a2))
> t <- t1[t2]
> b <- a[t,]
>
> It would be nice if the decreasing arg to order was allowed to be a  
> vector, with entries applied in the obvious way, so that
>
> t <- with(a, order(a1, a2,  a3, decreasing = c(FALSE, FALSE, TRUE)))
>
> would work, but alas, it doesn't.
>
> Duncan Murdoch
>> In this new dataframe, the 3rd variable is in decreasing order
>> within each combination of a1 and a2, which is the desired result.
>> As expected, this still works if the second variable never changes:
>>
>> e1 <- rep(1:10, each=8)
>> e2 <- rep(2, 80)
>> e3 <- rep(c(1:8),10)
>> e <- data.frame(e1, e2, e3)
>> t <- order(e$e1, e$e2, rev(e$e3))
>> f <- e[t,]
>>
>> With my real data, I do not get the 3rd variable in inverse order
>> for each combination of the first 2. Here I recreate the
>> beginning of my dataframe for you to see what I mean:
>>
>> c1 <- c(rep(813,102), rep(826,48))
>> c2 <- rep(2,150)
>> c3 <- c(seq(1:102), seq(1,48))
>> c <- data.frame(c1, c2, c3)
>> t <- order(c$c1, c$c2, rev(c$c3))
>> d <- c[t,]
>>
>> The dataframe d is properly ordered for c1==826, but not for
>> c1==813, where c3 goes from 48 down to 1, and then from 102 down
>> to 49, instead of going from 102 down to 1. All the variables are
>> numeric, not factors.
>>
>> Any help/explanation will be greatly appreciated.
>>
>> Denis
>> sessionInfo()
>> R version 2.6.0 (2007-10-03)
>> powerpc-apple-darwin8.10.1
>>
>> locale:
>> fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>
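
The behaviour discussed in this thread can be checked with a minimal sketch like the one below. It rebuilds the 150-row example from the original question and compares the rev() attempt with the two approaches suggested in the reply (negating the numeric key, and two passes that rely on order() being stable). The names d0, bad, by_neg, by_two, by_radix and dec_within are illustrative only and do not appear in the thread. The last call assumes R 3.3.0 or later, where order() accepts a vector for decreasing when method = "radix"; that is an assumption about the reader's R version and was not available in the R 2.6.0 session shown above.

```r
# Data from the question: c3 restarts at 1 in the second c1 group.
d0 <- data.frame(c1 = c(rep(813, 102), rep(826, 48)),
                 c2 = rep(2, 150),
                 c3 = c(1:102, 1:48))

# rev() reverses the whole vector by position, so the sort keys no longer
# belong to the rows they are paired with.
bad <- d0[order(d0$c1, d0$c2, rev(d0$c3)), ]

# Approach 1: negate the key (assumes c3 is numeric with no NA/NaN).
by_neg <- d0[order(d0$c1, d0$c2, -d0$c3), ]

# Approach 2: sort by c3 decreasing first, then by (c1, c2);
# the second pass keeps ties in their current order because order() is stable.
t1 <- order(d0$c3, decreasing = TRUE)
t2 <- with(d0[t1, ], order(c1, c2))
by_two <- d0[t1[t2], ]

# Assumption: R >= 3.3.0, where the radix method takes one flag per key.
by_radix <- d0[order(d0$c1, d0$c2, d0$c3,
                     decreasing = c(FALSE, FALSE, TRUE),
                     method = "radix"), ]

# Hypothetical helper: is c3 strictly decreasing within every (c1, c2) group?
dec_within <- function(df) {
  ok <- tapply(df$c3, list(df$c1, df$c2), function(x) all(diff(x) < 0))
  all(ok, na.rm = TRUE)
}

dec_within(bad)     # rev() version
dec_within(by_neg)  # negated key
dec_within(by_two)  # two-pass stable sort
```

Using names other than c and t for the data frame and index vector also avoids masking the base functions c() and t(), which the code in the thread does.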