[R] Problems with aggregate() function in stats package

Marc Schwartz marc_schwartz at me.com
Thu Sep 15 23:50:04 CEST 2011


On Sep 15, 2011, at 4:07 PM, Jon Zadra wrote:

> Hi,
> 
> I'm having some problems with the aggregate() function in the {stats} package, and the documentation doesn't address them.
> 
> 1) Why would the first line work, but the second not? According to the help file, it accepts a "data=" argument.
> 
>> with(tsrc, aggregate(x=DistRatio, by=list(Condition), FUN=mean))
>    Group.1        x
> 1 Congruent 1.741789
> 2  Mismatch 1.771425
> 
>> aggregate(x=DistRatio, by=list(Condition), data=tsrc, FUN=mean)
> Error in aggregate(x = DistRatio, by = list(Condition), data = tsrc, FUN = mean) :
>  object 'DistRatio' not found
> 
> 
> 2) The subset argument also does not appear to work (perhaps this is the same issue?):
> 
>> with(tsrc, aggregate(x=DistRatio, by=list(Condition), subset=Drop!="Yes", FUN=mean))
>    Group.1        x
> 1 Congruent 1.741789
> 2  Mismatch 1.771425
> 
>> with(tsrc[tsrc$Drop!="Yes",], aggregate(x=DistRatio, by=list(Condition), FUN=mean))
>    Group.1        x
> 1 Congruent 1.700215
> 2  Mismatch 1.859795
> 
> 
> So, am I doing something wrong or is this function just not working as advertised?
> 
> Thanks,
> 
> Jon



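You are combining the syntax of the data frame method for aggregate() (the 'x' and 'by' arguments) with arguments ('data' and 'subset') that are only available in the formula method. In your second call, 'data' is not used to look up 'DistRatio', so R searches for it in the calling environment and fails with "object 'DistRatio' not found". In your with() call, 'subset' is not an argument of the data frame method, so it appears to be passed on through '...' to FUN, where mean() ignores it. That is why the result is the same as without 'subset'.

Be sure to read the help page for the function (?aggregate) and note, in the Usage section at the top of the page, the different syntax for each method. Of course, note the examples as well.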
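
As a rough sketch of the data frame method used on its own: the 'toy' data frame below is a made-up stand-in for 'tsrc' (the values are invented, only the column names come from your post). The columns are referenced explicitly, and the rows are dropped before the call rather than via a 'subset' argument.

```r
# Hypothetical stand-in for 'tsrc'; the values are made up for illustration.
toy <- data.frame(
  DistRatio = c(1.5, 1.9, 1.7, 2.1, 1.6, 2.0),
  Condition = c("Congruent", "Congruent", "Congruent",
                "Mismatch", "Mismatch", "Mismatch"),
  Drop      = c("No", "Yes", "No", "No", "No", "Yes")
)

# Data frame method: 'x' and 'by' are ordinary objects, so the columns
# have to be referenced explicitly (or via with()).
all_rows <- aggregate(x = toy["DistRatio"],
                      by = list(Condition = toy$Condition),
                      FUN = mean)

# This method has no 'subset' argument, so the rows are dropped
# from the data before the call.
kept <- toy[toy$Drop != "Yes", ]
kept_rows <- aggregate(x = kept["DistRatio"],
                       by = list(Condition = kept$Condition),
                       FUN = mean)

print(all_rows)
print(kept_rows)
```

Naming the list element in 'by' (Condition = ...) gives the grouping column a readable name instead of Group.1.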

If you want to use the formula method with 'data' and 'subset' you would need to use something like:

  aggregate(DistRatio ~ Condition, data = tsrc, FUN = mean)

or:

  aggregate(DistRatio ~ Condition, data = tsrc, subset = Drop != "Yes", FUN = mean)
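
To check the two calls above on something you can run without 'tsrc', here is a small sketch using a made-up 'toy' data frame (invented values, column names taken from your post). It compares the 'subset' argument of the formula method against subsetting the data by hand.

```r
# Hypothetical stand-in for 'tsrc'; the values are made up for illustration.
toy <- data.frame(
  DistRatio = c(1.5, 1.9, 1.7, 2.1, 1.6, 2.0),
  Condition = c("Congruent", "Congruent", "Congruent",
                "Mismatch", "Mismatch", "Mismatch"),
  Drop      = c("No", "Yes", "No", "No", "No", "Yes")
)

# Formula method: variables in the formula and in 'subset' are
# looked up in 'data', so no with() is needed.
f_all  <- aggregate(DistRatio ~ Condition, data = toy, FUN = mean)
f_kept <- aggregate(DistRatio ~ Condition, data = toy,
                    subset = Drop != "Yes", FUN = mean)

# Same groups computed after subsetting by hand, for comparison.
manual <- aggregate(DistRatio ~ Condition,
                    data = toy[toy$Drop != "Yes", ], FUN = mean)

print(f_all)
print(f_kept)

# The two approaches should agree.
stopifnot(isTRUE(all.equal(f_kept, manual)))
```

One difference to keep in mind: the formula method defaults to na.action = na.omit, so rows with missing values in the variables used are dropped before FUN is applied.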

HTH,

Marc Schwartz


