[R] Summarizing For Values with Multiple categories

jim holtman jholtman at gmail.com
Sun Oct 24 01:58:59 CEST 2010


Here is another way of doing it, step by step, using only a few base
functions:

> # had to put some separators in since data format was not apparent
> # best to provide sample data with 'dput'
> x <- read.table(textConnection("Cat1|Cat2 |Cat3 | COG |Counts
+   A  |  B |   C |COG1 |    10
+   B  |  D  ||    COG2   |  20
+   C   |||        COG3  |   30
+   D   |||        COG4   |  40")
+     , header = TRUE
+     , as.is = TRUE
+     , strip.white = TRUE
+     , sep = "|"
+     )
> closeAllConnections()
> x
  Cat1 Cat2 Cat3  COG Counts
1    A    B    C COG1     10
2    B    D      COG2     20
3    C           COG3     30
4    D           COG4     40
> # pull out the data into a 'long' format based on the first 3 columns
> # iterate over the first three columns combining with "Counts"
> long <- do.call(rbind, lapply(x[1:3], function(.col){
+     cbind(.col, x[['Counts']])
+ }))
>
> # remove blanks
> long <- long[long[,1] != "", ]
>
> # now aggregate; cbind() above produced a character matrix, so convert the counts to numeric
> tapply(as.numeric(long[,2]), long[,1], sum)
 A  B  C  D
10 30 40 60
>
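
The same steps can also be written as a plain script without the console
prompts. This is only a sketch under a few assumptions: it rebuilds the sample
data with the 'text' argument of read.table() (so it needs a version of R where
that argument exists), and the names 'cats', 'long' and 'totals' are mine, not
from the original post. It keeps the data in a data frame so that 'Counts' stays
numeric, and it uses the 'by =' form of aggregate() instead of a formula.

```r
# Sample data from the session above, re-typed with '|' separators.
x <- read.table(text = "Cat1|Cat2|Cat3|COG|Counts
A|B|C|COG1|10
B|D||COG2|20
C|||COG3|30
D|||COG4|40",
    header = TRUE, as.is = TRUE, strip.white = TRUE, sep = "|")

# Columns that hold the category labels (hypothetical helper name).
cats <- c("Cat1", "Cat2", "Cat3")

# Build the 'long' form: one row per (category, count) pair.
# unlist() concatenates the columns one after another, so the counts
# are repeated once per category column to line up with them.
long <- data.frame(
    Cat    = unlist(x[cats], use.names = FALSE),
    Counts = rep(x$Counts, times = length(cats)),
    stringsAsFactors = FALSE
)

# Remove the rows that came from blank category cells.
long <- long[long$Cat != "", ]

# Sum the counts per category using the non-formula interface.
totals <- aggregate(long$Counts, by = list(Cat = long$Cat), FUN = sum)
names(totals)[2] <- "Counts"
print(totals)
```

The 'by =' interface is used here because, as far as I know, the formula method
of aggregate() only exists from R 2.11.0 onward. An older installation would be
one possible explanation for the 'cannot coerce class "formula"' error quoted
below, although the R version is not stated in the thread.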


On Sat, Oct 23, 2010 at 7:03 PM, Alison Waller <alison.waller at embl.de> wrote:
> Thanks!
>
> I tried reading the help for aggregate and can't figure out which form of
> the formula I am using, and therefore the syntax.
>
> I'm getting the below error.
>
>> aggregate(counts ~ ind, merge(stack(CAT2COG), df, by = 1), sum)
> Error in as.data.frame.default(x) :
>  cannot coerce class "formula" into a data.frame
>> aggregate(counts ~ Cats, merge(stack(CAT2COG), df, by = 1), sum)
> Error in as.data.frame.default(x) :
>  cannot coerce class "formula" into a data.frame
>> Cats
> [1] A B C D E
> Levels: A B C D E
>> aggregate(counts ~ COGs, merge(stack(CAT2COG), df, by = 1), sum)
> Error in as.data.frame.default(x) :
>  cannot coerce class "formula" into a data.frame
>
> On 24-Oct-10, at 12:50 AM, Gabor Grothendieck wrote:
>
>>> aggregate(counts ~ ind, merge(stack(CAT2COG), df, by = 1), sum)
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


