[R] trouble with summary tables with several variables using aggregate function

Dennis Murphy djmuser at gmail.com
Thu May 19 21:33:30 CEST 2011


Oops, I didn't see Marc's reply; his solution is much more compact. From
R 2.11.0 onward, aggregate() has a formula interface that usually works
nicely:

aggregate(Var3 ~ Var1 + Var2, data = d, FUN = table)
  Var1 Var2 Var3.D Var3.I
1   S1   T1      2      2
2   S2   T1      2      2
3   S1   T2      2      2
4   S2   T2      0      4
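
One thing to watch: with FUN = table, the counts come back as a single
matrix column named Var3 (printed as Var3.D and Var3.I), not as two
ordinary columns. Here is a minimal sketch of flattening it, assuming the
example data are rebuilt as below (the names res and flat are only for
illustration):

```r
# Rebuild the example data from the original post (dummy column omitted).
d <- data.frame(
  Var1 = rep(c("S1", "S2"), each = 8),
  Var2 = rep(rep(c("T1", "T2"), each = 4), times = 2),
  Var3 = c("I", "I", "D", "D",   # S1, T1
           "I", "I", "D", "D",   # S1, T2
           "I", "I", "D", "D",   # S2, T1
           "I", "I", "I", "I"),  # S2, T2
  stringsAsFactors = TRUE        # factors, so table() keeps the empty D level
)

res <- aggregate(Var3 ~ Var1 + Var2, data = d, FUN = table)

# res has three columns; the third is a 4 x 2 matrix of counts.
dim(res$Var3)

# do.call(data.frame, ...) splits the matrix into ordinary columns
# named Var3.D and Var3.I.
flat <- do.call(data.frame, res)
names(flat)
```

Keeping Var3 as a factor matters here: if it were a plain character
vector, table() on the S2/T2 group would only see the level I.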

Dennis
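
P.S. If reshape2 is not installed, base reshape() can produce the same
wide layout as the dcast() call quoted below. This is a rough sketch, not
the only way to do it; it assumes d holds Var1, Var2 and Var3 as factors
(rebuilt here so the snippet runs on its own), and the names counts and
wide are only for illustration:

```r
# Rebuild the example data from the original post (dummy column omitted).
d <- data.frame(
  Var1 = rep(c("S1", "S2"), each = 8),
  Var2 = rep(rep(c("T1", "T2"), each = 4), times = 2),
  Var3 = c(rep(c("I", "I", "D", "D"), times = 3), rep("I", 4)),
  stringsAsFactors = TRUE
)

# Long table of counts for every Var1 x Var2 x Var3 combination,
# including the empty S2/T2/D cell. Columns: Var1, Var2, Var3, Freq.
counts <- as.data.frame(table(d))

# Spread the Var3 levels across columns; reshape() names them
# Freq.D and Freq.I.
wide <- reshape(counts,
                idvar     = c("Var1", "Var2"),
                timevar   = "Var3",
                direction = "wide")
wide
```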

On Thu, May 19, 2011 at 12:29 PM, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> The dummy column really isn't necessary. Here's another way to get the
> result you want. Let d be the name of your example data frame.
>
> d <- d[, 1:3]
> (dtable <- as.data.frame(ftable(d, row.vars = c(1, 2))))
>   Var1 Var2 Var3 Freq
> 1   S1   T1    D    2
> 2   S2   T1    D    2
> 3   S1   T2    D    2
> 4   S2   T2    D    0
> 5   S1   T1    I    2
> 6   S2   T1    I    2
> 7   S1   T2    I    2
> 8   S2   T2    I    4
>
> An alternative to the reshape() function is the reshape2 package,
> which has a function dcast() that allows you to rearrange the data
> frame as you desire.
>
> library(reshape2)
> dcast(dtable, Var1 + Var2 ~ Var3)
> Using Freq as value column: use value_var to override.
>   Var1 Var2 D I
> 1   S1   T1 2 2
> 2   S1   T2 2 2
> 3   S2   T1 2 2
> 4   S2   T2 0 4
>
>
> HTH,
> Dennis
>
> On Thu, May 19, 2011 at 2:13 AM, Luma R <rluma1979 at gmail.com> wrote:
>> Dear all,
>>
>> I am having trouble creating summary tables using the aggregate function.
>>
>> Given the following table:
>>
>>
>> Var1   Var2   Var3   dummy
>> S1     T1     I      1
>> S1     T1     I      1
>> S1     T1     D      1
>> S1     T1     D      1
>> S1     T2     I      1
>> S1     T2     I      1
>> S1     T2     D      1
>> S1     T2     D      1
>> S2     T1     I      1
>> S2     T1     I      1
>> S2     T1     D      1
>> S2     T1     D      1
>> S2     T2     I      1
>> S2     T2     I      1
>> S2     T2     I      1
>> S2     T2     I      1
>>
>>
>> I want to create a summary table that shows, for each combination of Var1
>> and Var2, the number of cells with Var3 = D and with Var3 = I:
>>
>>         Var1   Var2   Var3(D)   Var3(I)
>>         S1     T1     2         2
>>         S1     T2     2         2
>>         S2     T1     2         2
>>         S2     T2     0         4
>>
>>
>>
>> However, if I do: Count.Cells = aggregate(dummy ~ Var1 + Var2 + Var3, FUN = 'sum'),
>> I get:
>>
>>           Var1   Var2   Var3   Count of Resp
>>           S1     T1     D      2
>>           S1     T1     I      2
>>           S1     T2     D      2
>>           S1     T2     I      2
>>           S2     T1     D      2
>>           S2     T1     I      2
>>           S2     T2     I      4
>>
>>
>> Is there a way to get different columns for each Var3 level?
>>
>>
>> Thank you for any help you can give!
>>
>


