[R] Group by in R

Mon Apr 13 15:56:39 CEST 2009

Gabor Grothendieck wrote:
> 
> SQL has the order by clause.
> 

Gabor, thanks for the suggestion. I thought about this, but ORDER BY alone
cannot create the tabular structure that I need. Here is more detail about my
setting:

f1, f2, f3 (X1, X2, X3 in the sample below) form triplets, each occurring a
different number of times. Each observation also falls into one of the two
categories of f4 (X4 below). Here is a sample:

> data
   X1 X2 X3 X4
1   1  2  2  1
2   1  1  2  2
3   1  1  2  2
4   2  2  1  2
5   1  1  2  2
6   2  2  1  2
7   1  1  2  1
8   2  2  1  2
9   1  2  1  1
10  1  1  2  2

library(sqldf)
sqldf("select X1, X2, X3, X4, count(*) as CNT from data
       group by X1, X2, X3, X4
       order by X4, X1, X2, X3")

  X1 X2 X3 X4 CNT
1  1  1  2  1   1
2  1  2  1  1   1
3  1  2  2  1   1
4  1  1  2  2   4
5  2  2  1  2   3

The counts are correct, but the layout is not what I need: triplets with a
zero count are missing, and the levels of X4 appear as rows rather than
columns. I need a kind of contingency table:

 unique triplets |  levels of X4
 of X1:X3        |    1      2
-----------------+---------------
      1 1 1      |    0      0
      1 1 2      |    1      4
      1 2 1      |    1      0
      1 2 2      |    1      0
      2 1 1      |    0      0
      2 1 2      |    0      0
      2 2 1      |    0      3
      2 2 2      |    0      0

So the final result should be a table structure like:

0 0
1 4
1 0
1 0
0 0
0 0
0 3
0 0

I guess I could probably do this in SQL with a combination of OUTER JOINs,
but I thought that R might have a more elegant solution based on "factor"
and "table".
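
For reference, here is a minimal sketch of what a "factor" plus "table"
approach might look like in base R. It assumes that X1, X2 and X3 only take
the values 1 and 2, and it rebuilds the sample data shown above; the names
lv, combos, triplet_levels, triplet, tab and counts are illustrative only.
Listing every possible triplet as a factor level is what keeps the
zero-count rows in the result.

```r
# Sample data copied from the printout above.
data <- data.frame(
  X1 = c(1, 1, 1, 2, 1, 2, 1, 2, 1, 1),
  X2 = c(2, 1, 1, 2, 1, 2, 1, 2, 2, 1),
  X3 = c(2, 2, 2, 1, 2, 1, 2, 1, 1, 2),
  X4 = c(1, 2, 2, 2, 2, 2, 1, 2, 1, 2)
)

# Assumption: each of X1..X4 can only be 1 or 2.
lv <- 1:2

# All 2 x 2 x 2 triplets. expand.grid varies its first argument fastest,
# so X3 is listed first to get the order 1 1 1, 1 1 2, 1 2 1, ..., 2 2 2.
combos <- expand.grid(X3 = lv, X2 = lv, X1 = lv)
triplet_levels <- paste(combos$X1, combos$X2, combos$X3)

# A factor with explicit levels keeps triplets that never occur in the data.
triplet <- factor(paste(data$X1, data$X2, data$X3), levels = triplet_levels)

# Cross-tabulate triplets (rows) against the levels of X4 (columns).
tab <- table(triplet, X4 = factor(data$X4, levels = lv))
print(tab)

# Bare 8 x 2 matrix of counts, without dimnames.
counts <- matrix(tab, nrow = nrow(tab))
print(counts)
```

On the sample data this should give the same counts as the table above,
including the all-zero rows. If the variables can take other values, lv
would need to be replaced by the actual levels of each column.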

Thanks,
Nick 