[R] Group by in R

Gabor Grothendieck ggrothendieck at gmail.com
Mon Apr 13 17:51:17 CEST 2009


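Assuming DF is your data frame, try this:

ftable(DF)

This gives a flat contingency table with the X1, X2, X3 combinations as
rows and the levels of X4 as columns.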

In SQL you can get close (only the triplets that actually occur in the
data are returned) with:

sqldf("select X1, X2, X3,
         sum(X4 = 1) as `X4=1`,
         sum(X4 = 2) as `X4=2`
       from DF
       group by X1, X2, X3
       order by X1, X2, X3")
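
For what it's worth, here is a minimal sketch of the ftable() route on the sample data quoted below. The data frame is retyped by hand from that sample, and the conversion to factors with levels 1:2 is my own assumption, added so that triplets which never occur would still show up as zero rows; `counts` is just an illustrative name.

```r
# Sample data retyped from the quoted message (called DF here).
DF <- data.frame(
  X1 = c(1, 1, 1, 2, 1, 2, 1, 2, 1, 1),
  X2 = c(2, 1, 1, 2, 1, 2, 1, 2, 2, 1),
  X3 = c(2, 2, 2, 1, 2, 1, 2, 1, 1, 2),
  X4 = c(1, 2, 2, 2, 2, 2, 1, 2, 1, 2)
)

# Assumption: every column takes only the values 1 and 2. Fixing the
# levels keeps unobserved combinations in the table as zero counts.
DF[] <- lapply(DF, factor, levels = 1:2)

# Flat contingency table: rows are the X1, X2, X3 triplets,
# columns are the levels of X4 (the last variable by default).
ft <- ftable(DF)
print(ft)

# The bare counts as a plain matrix, without the ftable attributes.
counts <- matrix(as.vector(ft), nrow = nrow(ft))
print(counts)
```

Printing ft shows the eight triplets as rows and the two X4 levels as columns; counts holds the same numbers as a plain 8 x 2 matrix.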

On Mon, Apr 13, 2009 at 9:56 AM, Nick Angelou <nikolay12 at yahoo.com> wrote:
>
>
> Gabor Grothendieck wrote:
>>
>> SQL has the order by clause.
>>
>
> Gabor, thanks for the suggestion. I thought about this but ORDER BY cannot
> create the tabular structure that I need. Here is more detail about my
> setting:
>
> The columns f1, f2, f3 (X1, X2, X3 in the sample below) form triplets,
> each occurring a different number of times. Each occurrence falls into
> one of the two categories of f4 (X4 below). Here is a sample:
>
> > data
>   X1 X2 X3 X4
> 1   1  2  2  1
> 2   1  1  2  2
> 3   1  1  2  2
> 4   2  2  1  2
> 5   1  1  2  2
> 6   2  2  1  2
> 7   1  1  2  1
> 8   2  2  1  2
> 9   1  2  1  1
> 10  1  1  2  2
>
> sqldf("select X1, X2, X3, X4, count(*) CNT from data group by X1, X2, X3, X4
> ORDER BY X4, X1, X2, X3")
>
>   X1 X2 X3 X4 CNT
> 1  1  1  2  1   1
> 2  1  2  1  1   1
> 3  1  2  2  1   1
> 4  1  1  2  2   4
> 5  2  2  1  2   3
>
> The counts are fine, though it's not exactly what I need. I need a kind of
> contingency table:
>
>                           | levels of X4
> unique triplets of X1:X3  |   1   |   2
> --------------------------+-------+------
>           1 1 1           |   0   |   0
>           1 1 2           |   1   |   4
>           1 2 1           |   1   |   0
>           1 2 2           |   1   |   0
>           2 1 1           |   0   |   0
>           2 1 2           |   0   |   0
>           2 2 1           |   0   |   3
>           2 2 2           |   0   |   0
>
>
> So the final result should be a table structure like:
>
>
> 0 0
> 1 4
> 1 0
> 1 0
> 0 0
> 0 0
> 0 3
> 0 0
>
> I guess I could probably do this in SQL with a combination of OUTER JOINs,
> but I thought that R might have a more elegant solution based on "factor"
> and "table".
>
> Thanks,
> Nick
> --
> View this message in context: http://www.nabble.com/Group-by-in-R-tp23020587p23022717.html
> Sent from the R help mailing list archive at Nabble.com.
>