[R] mean of subset of rows

Jeffrey Robert Spies jspies at nd.edu
Mon Oct 1 18:42:50 CEST 2007


You were on the right track with the for loop, but often you can do  
the same thing looplessly (I know, it's not really a word) in R:

If your data is like this:

data <- data.frame(ID = rep(letters[1:4], 5), size = runif(20))

then apply either

tapply(data$size, data$ID, mean)

or

aggregate(data$size, list(data$ID), mean)
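
Since you asked for the result as another data.frame, it may help to see that the two calls return different kinds of objects. The following is only a sketch on the toy data above: set.seed() is there just to make the numbers repeatable, the names means_vec, means_df and means_df2 are made up for illustration, and the formula form of aggregate() assumes a reasonably recent version of R.

```r
# Toy data with the same shape as above; the seed only makes it repeatable.
set.seed(1)
data <- data.frame(ID = rep(letters[1:4], 5), size = runif(20))

# tapply() returns a named 1-d array with one element per ID.
means_vec <- tapply(data$size, data$ID, mean)

# aggregate() returns a data.frame; the formula interface keeps the
# original column names (ID and size) in the result.
means_df <- aggregate(size ~ ID, data = data, FUN = mean)

# If you prefer tapply(), its result can be turned into a data.frame by hand.
means_df2 <- data.frame(ID = names(means_vec), size = as.vector(means_vec))

print(means_df)
```

Either means_df or means_df2 should then have one row per ID, with the mean in the size column.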

For further reference, section 4.2 in "An Introduction to R" ("The
function tapply() and ragged arrays") describes using tapply in this way.
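
As for why your loop showed only the last ID: as far as I can tell, "average" is assigned afresh on every pass, so only the final pass survives, and cbind(1:i, ID[i], ...) recycles the single ID and mean over i rows, which gives four identical rows for D when i is 4. A minimal sketch of a loop that keeps one row per ID could look like this. It uses the toy data from above (columns named ID and size rather than your V1), and the name ids is just for illustration.

```r
# Toy data with the same shape as above; the seed only makes it repeatable.
set.seed(1)
data <- data.frame(ID = rep(letters[1:4], 5), size = runif(20))

ids <- as.character(unique(data$ID))

# Create the result once, before the loop, with one row per ID,
# instead of overwriting it on every pass.
average <- data.frame(ID = ids, size = NA_real_)

for (i in seq_along(ids)) {
  # Rows belonging to the i-th ID only.
  subdata <- data[data$ID == ids[i], ]
  # Fill in just the i-th row of the result.
  average$size[i] <- mean(subdata$size)
}

print(average)
```

This should give the same means as tapply() or aggregate(), just with more typing.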

Jeff.

On Oct 1, 2007, at 11:57 AM, <darteta001 at ikasle.ehu.es> wrote:

> Dear list,
> this must be an easy one:
>
> I have a data.frame of two columns, "ID" with four different levels (A
> to D) and numerical "size", and each of the 4 different IDs is repeated
> a different number of times. I would like to get the mean size for each
> ID as another data.frame. I have tried the following:
>
>> ID= as.character(unique(data[,1])) # I use unique() because "data" will be larger in future
>> nIDs = length(ID)
>> for(i in 1:nIDs){
> +  subdata = subset(data,V1==ID[i])
> +  average = as.data.frame(cbind(1:i,ID[i],mean(subdata[,2])))
> + }
>
> Unfortunately, my output only gets the last level of ID four times:
>> average
>      V1 V2               V3
> 1  1  D 179.777777777778
> 2  2  D 179.777777777778
> 3  3  D 179.777777777778
> 4  4  D 179.777777777778
>
> How can I get what I need? There might be an easier way to do it, but
> I guess my skills aren't that good. Any suggestions are welcome.
>
> Regards,
>
> David
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


