[R] Summarizing For Values with Multiple categories

Gabor Grothendieck ggrothendieck at gmail.com
Sun Oct 24 00:50:05 CEST 2010


On Sat, Oct 23, 2010 at 6:15 PM, Alison Waller <alison.waller at embl.de> wrote:
> Hi all,
>
> I have some data as follows.
>
> Cat1 Cat2 Cat3  COG Counts
>   A    B    C COG1     10
>   B    D      COG2     20
>   C           COG3     30
>   D           COG4     40
>
> I would like to sum all the counts for each category:
> A       B       C       D
> 10      30      40      60
>
> > CAT2COG <- list(A = "COG1", B = c("COG1", "COG2"), C = c("COG1", "COG3"), D = c("COG2", "COG4"))
> > COG2CAT <- list(COG1 = c("A", "B", "C"), COG2 = c("B", "D"), COG3 = c("C"), COG4 = "D")
> > df <- data.frame(COGs = c("COG1", "COG2", "COG3", "COG4"), counts = c(10, 20, 30, 40))
>

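Try this. It stacks CAT2COG into long form (one row per COG/category
pair), merges that with df on the COG name, and sums counts per category: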

> aggregate(counts ~ ind, merge(stack(CAT2COG), df, by = 1), sum)
  ind counts
1   A     10
2   B     30
3   C     40
4   D     60
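
For anyone who wants to see the intermediate results, here is a sketch of
the same one-liner split into its three steps. It uses the CAT2COG and df
objects from the question; the names long, merged, res and check are only
illustrative and not part of the original code. The last line is an
alternative cross-check with sapply() and match(), offered as one possible
way to confirm the sums rather than as a better approach.

```r
# Data as given in the question
CAT2COG <- list(A = "COG1",
                B = c("COG1", "COG2"),
                C = c("COG1", "COG3"),
                D = c("COG2", "COG4"))
df <- data.frame(COGs = c("COG1", "COG2", "COG3", "COG4"),
                 counts = c(10, 20, 30, 40))

# Step 1: stack() flattens the list into one row per (COG, category) pair;
# the COG lands in column 'values' and the category in column 'ind'
long <- stack(CAT2COG)

# Step 2: merge on the first column of each data frame
# (long$values against df$COGs), attaching a count to every pair
merged <- merge(long, df, by = 1)

# Step 3: sum the counts within each category
res <- aggregate(counts ~ ind, merged, sum)
print(res)

# Cross-check (illustrative): look up each category's COGs directly in df
check <- sapply(CAT2COG, function(cogs) sum(df$counts[match(cogs, df$COGs)]))
print(check)
```

Both res and check should agree with the table above. Note that a COG
missing from df would be silently dropped by merge(), whereas the match()
version would give NA for that category.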

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


