[R] Odp: aggregate function oddity

Gustaf Rydevik gustaf.rydevik at gmail.com
Tue Sep 18 11:12:38 CEST 2007


On 9/18/07, Mihalicza Péter <mihalicza.peter at eski.hu> wrote:
> Sorry for the confusion, I was not clear enough, so I made a small
> example to illustrate:
>
>  >m=data.frame(fac1=rep(c(1,2),3), fac2=c("a","b","b","b","a","b"),
> num1=1:6, num2=7:12)
>  > m$fac1=as.factor(m$fac1)
>  > m
>   fac1 fac2 num1 num2
> 1    1    a    1    7
> 2    2    b    2    8
> 3    1    b    3    9
> 4    2    b    4   10
> 5    1    a    5   11
> 6    2    b    6   12
>  >#I would like to get the sum of num1 and num2 grouped by c(1,2) and c(a,b)
>  > ag=aggregate(m, list(m$fac1, m$fac2), sum)
> Error in Summary.factor(..., na.rm = na.rm) :
>         sum not meaningful for factors
>
>  >#I understand, that it is possible to do...
>
>  >ag=aggregate(m[,3:4], list(m$fac1, m$fac2), sum)
>  > ag
>   Group.1 Group.2 num1 num2
> 1       1       a    6   18
> 2       1       b    3    9
> 3       2       b   12   30
>
> but I do not understand why aggregate tries to sum fac1 and fac2, since
> they are grouping variables that need not, and must not, be summed. To my
> understanding, the aggregate help text also says nothing about omitting
> factor variables from the data frame.
>
> My question is whether I am missing something, or whether this is how
> aggregate works. If the latter, then what is the reason for it?
>
> Thanks, and sorry again!
>
> Yours,
> Peter Mihalicza
>
>

aggregate() does not assume that the grouping variables and the object
being aggregated are related to each other. It applies FUN to every
column of x, so passing all of m makes it try to sum the factor columns
as well. Thus, supply as x *exactly* the columns you want to aggregate,
in your case m[,c("num1","num2")]. The reason for this design I leave to
others to explain.
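
A minimal sketch of this, using the example data from the quoted message. The names given to the elements of the `by` list are my own choice, used only so the output columns are labelled fac1 and fac2 instead of Group.1 and Group.2. The second call uses the formula interface, which I believe is only available in later R versions, so treat it as an optional alternative.

```r
# Example data from the quoted message
m <- data.frame(fac1 = factor(rep(c(1, 2), 3)),
                fac2 = c("a", "b", "b", "b", "a", "b"),
                num1 = 1:6, num2 = 7:12)

# Pass only the numeric columns as x; the grouping variables go in `by`.
# Naming the list elements labels the grouping columns in the result.
ag <- aggregate(m[, c("num1", "num2")],
                by = list(fac1 = m$fac1, fac2 = m$fac2),
                FUN = sum)
print(ag)

# Formula interface (later R versions): left-hand side is what gets
# summed, right-hand side is the grouping.
ag2 <- aggregate(cbind(num1, num2) ~ fac1 + fac2, data = m, FUN = sum)
print(ag2)
```

Either way, the columns to be summed and the grouping variables are stated separately, so no factor column ever reaches sum().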


/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik


