[R] having problems re-ordering a dataframe

Duncan Murdoch murdoch at stats.uwo.ca
Sat Oct 27 18:44:40 CEST 2007


Chabot Denis wrote:
> Dear R users,
>
> I need to reorder a dataframe using 3 variables to determine the  
> sorting order.
>
> When I create a simple dataframe to test the method, things work as I  
> expected:
>
> a1 <- rep(1:10, each=8)
> a2 <- rep(rep(1:2, each=4), 10)
> a3 <- rep(c(1:4),20)
> (a <- data.frame(a1, a2, a3))
>
> for each combination of a1 and a2, a3 is increasing
>
> t <- order(a$a1, a$a2, rev(a$a3))
> b <- a[t,]
>   
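rev(a$a3) reverses the whole column, so row i is sorted on the a3 value 
of row n+1-i, which is a different row.  That happens to work in your 
test data because a3 repeats the same pattern in every (a1, a2) group, 
but not in general.  If you know a3 is numeric with no NA or NaN values, 
you could use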

t <- order(a$a1, a$a2, -a$a3)

to get the sort order reversed.  If not, you need to do two sorts, 
relying on the fact that order uses a stable sort:

t1 <- order(a$a3, decreasing=TRUE)
t2 <- with(a[t1,], order(a1, a2))
t <- t1[t2]
b <- a[t,]
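
As a rough check, here is a sketch that runs both approaches on the 
example data quoted further down (c1, c2, c3) and compares them with the 
rev() version.  The names dat, bad, by_neg and by_two are just 
illustrative, not anything from the original post.

```r
# The poster's example: two c1 groups of unequal size (102 and 48 rows).
c1 <- c(rep(813, 102), rep(826, 48))
c2 <- rep(2, 150)
c3 <- c(1:102, 1:48)
dat <- data.frame(c1, c2, c3)

# rev(c3) pairs row i with the c3 value of row 151 - i, a different row.
bad <- dat[order(dat$c1, dat$c2, rev(dat$c3)), ]

# Negating the key is fine here: c3 is numeric with no NA or NaN.
by_neg <- dat[order(dat$c1, dat$c2, -dat$c3), ]

# Two sorts: first by c3 decreasing, then by (c1, c2) on the reordered rows.
t1 <- order(dat$c3, decreasing = TRUE)
t2 <- with(dat[t1, ], order(c1, c2))
by_two <- dat[t1[t2], ]

identical(by_neg, by_two)   # the two correct approaches agree
head(bad$c3)                # starts at 48, not 102
head(by_neg$c3)             # starts at 102
```

The rev() version starts the c1 == 813 group at 48 rather than 102, 
which is the symptom described in the original question.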

It would be nice if the decreasing arg to order was allowed to be a 
vector, with entries applied in the obvious way, so that

t <- with(a, order(a1, a2,  a3, decreasing = c(FALSE, FALSE, TRUE)))

would work, but alas, it doesn't.
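
For readers on a newer R: as far as I know, from R 3.3.0 onward order() 
does accept a vector for decreasing, but only when method = "radix" is 
requested explicitly.  A minimal sketch on the test data from the 
question, assuming such a version is installed:

```r
a1 <- rep(1:10, each = 8)
a2 <- rep(rep(1:2, each = 4), 10)
a3 <- rep(1:4, 20)
a <- data.frame(a1, a2, a3)

# Assumes R >= 3.3.0: with method = "radix", decreasing may have one
# entry per sort key (a1 and a2 ascending, a3 descending).
t <- with(a, order(a1, a2, a3,
                   decreasing = c(FALSE, FALSE, TRUE),
                   method = "radix"))
b <- a[t, ]

# Same permutation as negating the numeric key.
identical(t, with(a, order(a1, a2, -a3)))
```

Unlike the negation trick, this form does not depend on a3 being 
numeric.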

Duncan Murdoch
> In this new dataframe, the 3rd variable is in decreasing order within  
> each combination of a1 and a2, which is the desired result.
> As expected, this still works if the second variable never changes:
>
> e1 <- rep(1:10, each=8)
> e2 <- rep(2, 80)
> e3 <- rep(c(1:8),10)
> e <- data.frame(e1, e2, e3)
> t <- order(e$e1, e$e2, rev(e$e3))
> f <- e[t,]
>
> With my real data, I do not get the 3rd variable in inverse order for  
> each combination of the first 2. Here I recreate the beginning of my  
> dataframe for you to see what I mean:
>
> c1 <- c(rep(813,102), rep(826,48))
> c2 <- rep(2,150)
> c3 <- c(seq(1:102), seq(1,48))
> c <- data.frame(c1, c2, c3)
> t <- order(c$c1, c$c2, rev(c$c3))
> d <- c[t,]
>
> The dataframe d is properly ordered for c1==826, but not for c1==813,  
> where c3 goes from 48 down to 1, and then from 102 down to 49,  
> instead of going from 102 down to 1. All the variables are numeric,  
> not factors.
>
> Any help/explanation will be greatly appreciated.
>
> Denis
> sessionInfo()
> R version 2.6.0 (2007-10-03)
> powerpc-apple-darwin8.10.1
>
> locale:
> fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base


