[Rd] Suggestion to extend aggregate() to return multiple and/or named values

Mike Lawrence Mike.Lawrence at DAL.CA
Fri Jul 13 18:29:37 CEST 2007

Hi all,

This is my first post to the developers list. As I understand it,
aggregate() currently applies a function to each cell of a data frame
but can only handle functions that return a single value.
aggregate() also cannot retain the names given to the returned
value. I've created an agg() function (pasted below) that is
apparently backwards compatible (i.e. it returns the same results as
aggregate() if the function returns a single unnamed value), but is
able to handle named and/or multiple return values. The code may be a
little inefficient (there must be an easier way to set up the 'temp'
data frame than to call aggregate() and remove the final column), but
I'm suggesting that something similar to this could profitably be used
to replace aggregate() entirely.

#modified aggregate command, allowing for multiple/named output values
	temp=aggregate(z,Ind,length) #dummy data frame
	temp=temp[,c(1:(length(temp)-1))] #remove last column from dummy frame
	for(i in 1:num.dv){
		for(j in 1:num.cells){

#create some factored data
z=rnorm(100) # the DV
A=rep(1:2,each=25,2) #one factor
B=rep(1:2,each=50) #another factor
Ind=list(A=A,B=B) #the factor list

aggregate(z,Ind,mean) #show the means of each cell
agg(z,Ind,mean) #should be identical to aggregate

aggregate(z,Ind,summary) #returns an error
agg(z,Ind,summary) #returns named columns

#Make a function that returns multiple unnamed values
agg(z,Ind,summary2) #returns multiple columns, default names
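
For readers without the full listing, here is a minimal sketch of how a
function with this behaviour could be written. It is not the agg() from
this post: the name agg_sketch, the split()/rbind approach, and the
default column names ("x", or "x1", "x2", ... for several unnamed values)
are all assumptions, and it assumes FUN returns the same number of values
for every cell.

```r
# Hypothetical sketch, NOT the original agg(): apply FUN to every cell
# defined by the factors in `by` and return one column per returned value.
agg_sketch <- function(x, by, FUN, ...) {
  # One level per non-empty cell; the first factor varies fastest,
  # which is the same row order aggregate() uses
  grp <- interaction(by, drop = TRUE)

  # Grouping values for each cell, sorted into the level order of grp
  first <- which(!duplicated(grp))
  first <- first[order(grp[first])]
  keys <- as.data.frame(by)[first, , drop = FALSE]
  rownames(keys) <- NULL

  # Apply FUN per cell, keeping only the values and their names
  res <- lapply(split(x, grp), function(v) {
    out <- FUN(v, ...)
    structure(as.vector(out), names = names(out))
  })

  # One row per cell, one column per returned value
  mat <- do.call(rbind, res)
  rownames(mat) <- NULL
  if (is.null(colnames(mat))) {
    # Assumed defaults for unnamed return values
    colnames(mat) <- if (ncol(mat) == 1L) "x" else paste0("x", seq_len(ncol(mat)))
  }

  cbind(keys, as.data.frame(mat))
}

# Same factored data as in the example above
z <- rnorm(100)
A <- rep(1:2, each = 25, 2)
B <- rep(1:2, each = 50)
Ind <- list(A = A, B = B)

agg_sketch(z, Ind, mean)                           # single unnamed value
agg_sketch(z, Ind, summary)                        # named columns
agg_sketch(z, Ind, function(v) unname(range(v)))   # stand-in for an unnamed multi-value function
```

The sketch builds the grouping columns directly from the factor list, so
it avoids the dummy aggregate() call used to set up 'temp'.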

Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://memetic.ca

Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
	- Piet Hein
