[R] contingency tables

Milan Bouchet-Valat nalimilan at club.fr
Fri Mar 16 15:10:53 CET 2012


On Friday 16 March 2012 at 06:46 -0700, mari681 wrote:
> Ok, before I definitely give up, and throw the laptop out of the window, or
> fill my data.frame manually, I'll ask for some help.
> I have a data.frame named MyTable with 3 columns, that looks like this:
<snip>

> The elements on the first column can be: red-j, orange-j, yellow-j, green-j,
> blue-j or purple-j. Elements on column 2 are contexts in which the color
> appears. And in column 3 there is a kind of frequency measurement between
> color and context.
> 
> I am trying to build a new data.frame where the 6 colors in column 1 are
> placed as column names, all the unique contexts in column 2 are placed as
> row names and the values in column 3 are in the corresponding cells (or 0 if
> a color-context intersection is empty).
> It is easy to prepare the empty data.frame, but I can't find out how to fill
> it up with the frequencies.
> Ideas? Does anybody have a ready script for this?
If I understand correctly, this is very simple using xtabs().

dat <- structure(list(V1 = structure(c(5L, 5L, 5L, 5L, 3L, 3L, 3L, 3L, 
6L, 6L, 6L, 6L, 6L, 6L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("blue-j", 
"green-j", "orange-j", "purple-j", "red-j", "yellow-j"), class = "factor"), 
    V2 = structure(c(1L, 4L, 10L, 12L, 8L, 27L, 29L, 31L, 11L, 
    18L, 21L, 24L, 25L, 26L, 14L, 15L, 16L, 23L, 17L, 19L, 2L, 
    3L, 7L, 9L, 13L, 20L, 22L, 5L, 6L, 33L, 28L, 29L, 30L, 32L
    ), .Label = c("appearanceblood-n", "appearancecypress-n", 
    "appearancefirmament-n", "appearanceground-n", "appearanceleave-n", 
    "appearancelesion-n", "appearanceman-n", "appearanceobject-n", 
    "appearancerange-n", "appearancesea-n", "appearanceskin-n", 
    "appearancesky-n", "appearancesurface-n", "appearancewater-n", 
    "architecturebuilding-n", "architecturecourse-n", "areaauditorium-n", 
    "areacircle-n", "areacity-n", "areagraph-n", "areainfarction-n", 
    "arealibrary-n", "areaRio-n", "as_adj_ascorn-n", "as_adj_asflax-n", 
    "as_adj_asgold-n", "as_adj_aspainting-n", "coloramethyst-n", 
    "colorbanknote-n", "colorbottle-n", "colorcar-n", "colorchocolate-n", 
    "colorViking-n"), class = "factor"), V3 = c(105.032, 93.749, 
    102.167, 10.898, 109.354, 93.248, 159.167, 117.985, 109.527, 
    87.064, 120.759, 219.739, 122.576, 90.814, 91.477, 103.582, 
    103.325, 106.614, 102.505, 150.005, 145.133, 148.655, 85.731, 
    90.706, 100.991, 92.708, 77.135, 119.423, 134.287, 145.516, 
    175.619, 158.045, 132.395, 141.833)), .Names = c("V1", "V2", "V3"),
    class = "data.frame")

xtabs(V3 ~ V2 + V1, data=dat)
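
Since you asked for a data.frame rather than a table, here is a minimal sketch of the same call on a tiny made-up data set, followed by a conversion step. The names toy, tab and wide and the values in toy are only for illustration and are not from your data. Note that xtabs() sums V3 over each context/color pair, so if a pair occurred more than once its values would be added together.

```r
# Toy data in the same long format as above (values are made up)
toy <- data.frame(
  V1 = c("red-j", "red-j", "blue-j"),
  V2 = c("sky-n", "sea-n", "sky-n"),
  V3 = c(10.5, 3.2, 7.1)
)

# Contexts (V2) become rows, colors (V1) become columns;
# combinations that do not occur in the data are filled with 0
tab <- xtabs(V3 ~ V2 + V1, data = toy)

# Turn the contingency table into an ordinary data frame,
# keeping the contexts as row names and the colors as column names
wide <- as.data.frame.matrix(tab)
wide
```

The same as.data.frame.matrix() step should work on the result of the xtabs() call on dat above.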


BTW, creating the data frame took me the most time, so please provide a
working data set using dput() next time.

Regards


