[Rd] Wanted: sort.data.frame

Kevin Wright kwright at eskimo.com
Wed Jul 21 18:25:54 CEST 2004


I've often wanted a function for sorting a data frame by multiple columns.
I know it's not too hard to do using the order function, but given the
frequency of questions about this on R-help, it would seem to me the task
could be simplified.

d = data.frame(x=c("A","D","A","C"),y=c(8,3,9,9),z=c(1,1,1,1))
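For reference, the order-based idiom mentioned above might look like the following for this data frame. This is only a sketch: the name d_sorted is made up for illustration, and negating a column to get descending order works here only because y is numeric.

```r
# Same data frame as above, repeated so the snippet runs on its own
d <- data.frame(x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 1))

# Sort by x ascending, then y descending, then z ascending.
# order() returns the row permutation; indexing d with it reorders the rows.
d_sorted <- d[order(d$x, -d$y, d$z), ]
print(d_sorted)
```

Every column has to be written out as d$..., and the minus sign does not work directly on character or factor columns, which is part of why a dedicated function seems worthwhile.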

My first attempt was something like
sort.data.frame(d, by=c(x,y))
which immediately failed since x is not an object.
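To illustrate the failure, here is a minimal sketch with a hypothetical helper, sort_by_names (not part of base R), that takes the column names as a character vector. Passing bare names fails because c(x, y) is evaluated in the caller's environment, where no object x exists; quoting the names avoids that, at the cost of having no way to ask for descending order. The snippet assumes no object named x is defined in the workspace.

```r
d <- data.frame(x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 1))

# Hypothetical helper: sort a data frame by the columns named in `by`,
# all in ascending order.
sort_by_names <- function(data, by) {
  stopifnot(is.data.frame(data), is.character(by), all(by %in% names(data)))
  # unname() so that column names are not mistaken for arguments of order()
  keys <- unname(as.list(data[by]))
  data[do.call(order, keys), , drop = FALSE]
}

# Bare names: evaluating c(x, y) raises "object 'x' not found"
failed <- inherits(try(sort_by_names(d, by = c(x, y)), silent = TRUE), "try-error")

# Quoted names work
sorted <- sort_by_names(d, by = c("x", "y"))
print(sorted)
```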

I then decided something like this would be better
sort.data.frame(d, ~ x -y +z)
where + indicates ascending and - indicates descending.  
This ordering of the arguments seems natural to me, but in order to
be consistent with other functions that have formulae it would 
probably be better to use sort.data.frame(formula, data).
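As a rough illustration of the formula idea, here is a minimal sketch that walks the right-hand side of a one-sided formula and builds the keys for order(). The name sort_df and the (data, formula) argument order are assumptions made for this example, not an existing function. It supports only column names joined by + or -, and it uses xtfrm() so that a minus sign also works for character and factor columns.

```r
# Hypothetical helper: sort a data frame by a formula such as ~ x - y + z,
# where + means ascending and - means descending.
sort_df <- function(data, formula) {
  stopifnot(is.data.frame(data), inherits(formula, "formula"),
            length(formula) == 2L)  # one-sided formula only

  # Recursively collect list(name, sign) entries from the expression tree
  collect <- function(expr, sign = 1) {
    is_plus  <- is.call(expr) && identical(expr[[1]], as.name("+"))
    is_minus <- is.call(expr) && identical(expr[[1]], as.name("-"))
    if (is_plus || is_minus) {
      op_sign <- if (is_minus) -1 else 1
      if (length(expr) == 2L) {
        # unary + or -
        collect(expr[[2]], sign * op_sign)
      } else {
        # binary: left operand keeps the sign, right operand takes the operator's
        c(collect(expr[[2]], sign), collect(expr[[3]], sign * op_sign))
      }
    } else if (is.name(expr)) {
      list(list(name = as.character(expr), sign = sign))
    } else {
      stop("only column names joined by + or - are supported")
    }
  }

  keys <- lapply(collect(formula[[2]]), function(k) {
    if (!k$name %in% names(data)) stop("unknown column: ", k$name)
    # xtfrm() maps any sortable column to numbers, so the sign can be applied
    k$sign * xtfrm(data[[k$name]])
  })
  data[do.call(order, keys), , drop = FALSE]
}

d <- data.frame(x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 1))
res <- sort_df(d, ~ x - y + z)
print(res)
```

Working on the parsed formula directly avoids pasting strings together and re-parsing them. Swapping the arguments to (formula, data) would be a one-line change.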

I spent an hour or so working on this and then admitted that manipulating
R formulas is not one of my stronger skills.  (I would have used a loop, and
then substitute and eval(parse(text=...)), and would be embarrassed
for anyone to see it.)

My personal feeling is that this function would be quite helpful and
reduce the frequency of sort questions on R-help.

This might be a nice, modest programming challenge for R gurus.

Anyone who contributes such a function would have my respect.

Kevin Wright


