[R] Group by in R

Paul Johnson pauljohn32 at gmail.com
Wed Apr 15 06:50:27 CEST 2009


On Mon, Apr 13, 2009 at 8:56 AM, Nick Angelou <nikolay12 at yahoo.com> wrote:
>
>> data
>   X1 X2 X3 X4
> 1   1  2  2  1
> 2   1  1  2  2
> 3   1  1  2  2
> 4   2  2  1  2
> 5   1  1  2  2
> 6   2  2  1  2
> 7   1  1  2  1
> 8   2  2  1  2
> 9   1  2  1  1
> 10  1  1  2  2
>
> sqldf("select X1, X2, X3, X4, count(*) CNT from data group by X1, X2, X3, X4
> ORDER BY X4, X1, X2, X3")
>
>  X1 X2 X3 X4 CNT
> 1  1  1  2  1   1
> 2  1  2  1  1   1
> 3  1  2  2  1   1
> 4  1  1  2  2   4
> 5  2  2  1  2   3
>
> The counts are fine, though it's not exactly what I need. I need a kind of
> contingency table:
>
>                          | levels of X4 |
>                          ----------------
> unique triplets of X1:X3 |   1   |   2  |
> -----------------------------------------
>          1 1 1           |   0       0
>          1 1 2           |   1       4
>          1 2 1           |   1       0
>          1 2 2           |   1       0
>          2 1 1           |   0       0
>          2 1 2           |   0       0
>          2 2 1           |   0       3
>          2 2 2           |   0       0
>
>
> So the final result should be a table structure like:
>
>
> 0 0
> 1 4
> 1 0
> 1 0
> 0 0
> 0 0
> 0 3
> 0 0
>

Here is one way to get the numbers you want: create a new variable
that combines the values of the first three columns, then cross-tabulate
it against the fourth.



# The values are listed row by row, so the matrix must be filled with byrow=TRUE
md <- matrix(c(1,2,2,1,1,1,2,2,1,1,2,2,2,2,1,2,1,1,2,2,2,2,1,2,1,1,2,1,2,2,1,2,1,2,1,1,1,1,2,2),ncol=4,byrow=TRUE)

dat <- as.data.frame(md)
names(dat)<- c("x1","x2","x3","x4")

# One label per row, e.g. "1-1-2", built from the first three columns
newvar <- factor(paste(dat$x1,dat$x2,dat$x3,sep="-"))

table(newvar, dat$x4)


Behold:

> table(newvar, dat$x4)

newvar  1 2
  1-1-2 1 4
  1-2-1 1 0
  1-2-2 1 0
  2-2-1 0 3
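

Because newvar is built only from the rows that occur in the data,
triplets that are never observed get no row in the table, while the
question asked for all eight of them, zeros included. A minimal sketch
of one way to do that is to give the factor an explicit set of levels.
The names grid, all_levels, triplet and tab are my own, and I am
assuming each of x1, x2, x3 can only take the values 1 and 2:

```r
# Data as posted in the question, entered column by column.
dat <- data.frame(
  x1 = c(1, 1, 1, 2, 1, 2, 1, 2, 1, 1),
  x2 = c(2, 1, 1, 2, 1, 2, 1, 2, 2, 1),
  x3 = c(2, 2, 2, 1, 2, 1, 2, 1, 1, 2),
  x4 = c(1, 2, 2, 2, 2, 2, 1, 2, 1, 2)
)

# Enumerate every possible triplet (assumed: values 1 and 2 only).
# expand.grid varies its first argument fastest, so x3 goes first
# to get the order 1-1-1, 1-1-2, 1-2-1, ...
grid <- expand.grid(x3 = 1:2, x2 = 1:2, x1 = 1:2)
all_levels <- paste(grid$x1, grid$x2, grid$x3, sep = "-")

# Explicit levels keep the unobserved triplets as rows of zeros.
triplet <- factor(paste(dat$x1, dat$x2, dat$x3, sep = "-"),
                  levels = all_levels)

tab <- table(triplet, x4 = dat$x4)
print(tab)
```

If a bare 8 x 2 matrix of counts is wanted rather than a table object,
unclass(tab) should give that.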



-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas



