[R] confirming behavior of "by"

Daryl Morris darylm at uw.edu
Tue Sep 28 21:54:19 CEST 2010


Hi,

I'm using "by" to summarize by multiple groups, and want to extract the
returned object into a tidy data frame.  I'm trying to find a simple way to
name the rows of that data frame.  I'd like each name to look something like
index1.val1.index2.val2, where index1 and index2 are the names of the
indices and val1 and val2 are the values those indices take for that row.
(The calling function will do a bit more processing.)

I had thought to use attr(byOut,"dimnames") for this, but as far as I can
tell "by" outputs that as a string rather than as a vector... and I'm
too lazy to figure out parsing that at this point.  I'm thinking it's
probably easier to determine the order external to "by".
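
It may be worth checking what attr(byOut,"dimnames") actually holds before giving up on it.  In the versions of R I am aware of, it is a named list with one character vector per index, not a single string.  The sketch below assumes that, and uses made-up data: the column names g1, g2 and x, and the function passed to "by", are purely illustrative.

```r
# Toy data; g1, g2 and x are hypothetical names chosen for illustration.
df <- data.frame(
  g1 = c("b", "a", "b", "a"),
  g2 = c("y", "y", "x", "x"),
  x  = c(1, 2, 3, 4)
)

# Summarize x within each combination of g1 and g2.
byOut <- by(df, df[c("g1", "g2")], function(d) sum(d$x))

# Same object as attr(byOut, "dimnames"): a named list, one entry per index.
dn <- dimnames(byOut)
str(dn)

# expand.grid varies the first index fastest, which matches the order in
# which the cells of the result array are stored.
grid <- expand.grid(dn, stringsAsFactors = FALSE)
labels <- paste(names(dn)[1], grid[[1]], names(dn)[2], grid[[2]], sep = ".")

# One row per cell, named index1.val1.index2.val2.
result <- data.frame(value = as.vector(unclass(byOut)), row.names = labels)
print(result)
```

Note that combinations which never occur in the data still get a cell (holding NA), so they would show up as rows here too.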

Finally, my question.  The help for "by" says: "A data frame is
split by row into data frames subsetted by the values of one or more
factors".  Should I infer from this that the indices are converted to
factors, and that the order of the rows would be the same as if we had
called factor() with the default options (i.e. alphabetical)?  Further, is
that applied iteratively, with each subgroup broken down by the levels of
the remaining indices (again in factor order)?  And does the order of the
data have no bearing on the order of the results?
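
One way to probe these ordering questions empirically is sketched below.  It assumes that "by" for data frames converts each index with as.factor (which is what I understand tapply does underneath), so that non-factor indices get sorted levels and existing factors keep their own level order.  The data and the names g1, g2, x and f are hypothetical.

```r
# Toy data; all names here are made up for illustration.
df <- data.frame(
  g1 = c("b", "a", "b", "a"),
  g2 = c("y", "y", "x", "x"),
  x  = 1:4
)
shuffled <- df[c(3, 1, 4, 2), ]   # same rows, different order

f <- function(d) sum(d$x)
out1 <- by(df, df[c("g1", "g2")], f)
out2 <- by(shuffled, shuffled[c("g1", "g2")], f)

# Row order of the input should not change the layout of the result.
same_layout <- identical(dimnames(out1), dimnames(out2))
same_values <- isTRUE(all.equal(as.vector(unclass(out1)),
                                as.vector(unclass(out2))))

# The levels should match what factor() gives by default (sorted order).
matches_factor <- identical(dimnames(out1)$g1, levels(factor(df$g1)))

# An explicit level order on the index should carry through to the result.
df$g1 <- factor(df$g1, levels = c("b", "a"))
out3 <- by(df, df[c("g1", "g2")], f)
g1_order <- dimnames(out3)$g1

print(c(same_layout, same_values, matches_factor))
print(g1_order)
```

If that holds, the result is less an iterative split than a full cross-classification: one cell for every combination of levels, with the first index varying fastest.  The default sorted order may also depend on the locale rather than being strictly alphabetical.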

Hopefully that makes sense.

Or perhaps someone else has a better way of getting the job done?

Thanks, Daryl Morris
FHCRC, UW


