[R] scatterplot and correlation for weird data format

hadley wickham h.wickham at gmail.com
Mon Feb 16 18:13:12 CET 2009


On Mon, Feb 16, 2009 at 10:21 AM, William Simpson
<william.a.simpson at gmail.com> wrote:
> I have data in a format like this:
>
> name    ssex    sex     view    num     rating  rt
> ahl4    f       m       f       56      -108    2246
> ahl4    f       m       f       74      85      1444
> ahl4    f       m       f       52      151     1595
> ahl4    f       m       f       85      1       1447
> ahl4    f       m       f       53      46      1716
> ahl4    f       m       f       37      145     1276
> ahl4    f       m       f       50      98      1465
> ahl4    f       m       f       51      -26     1322
> ahl4    f       m       f       38      -97     1790
> ahl4    f       m       f       14      -158    865
> ...
> ahl4    f       m       p       43      -136    1669
> ahl4    f       m       p       10      -59     808
> ahl4    f       m       p       67      -111    1279
> ahl4    f       m       p       85      -86     994
> ahl4    f       m       p       100     134     1337
> ahl4    f       m       p       76      56      665
> ahl4    f       m       p       51      -49     594
> ahl4    f       m       p       33      -118    505
> ahl4    f       m       p       49      -156    1283
> ...
> and so on for many subjects (name)
>
> I would like to do a scatterplot of the rating given by each subject
> (with identifier "name") for the frontal (view=="f") and profile
> (view=="p") views of each face (each face has an identifier "num").
> I'd like to find the correlation as well.
> For each subject, since there are 100 faces, there will be 100 points
> on the scatterplot. I would just lump all the subjects' data together
> for the plot and correlation I think (unless somebody tells me I
> should do each subject separately).

You might find the reshape package, http://had.co.nz/reshape, helpful.
You could do something like:

library(reshape)
# keep num as an id variable so each face gets its own row
dfm <- melt(mydataframe, measure.vars = c("rating", "rt"))
cast(dfm, ... ~ view, subset = variable == "rating")

Then do a scatterplot of the resulting columns f and p, and pass the
same two columns to cor() for the correlation.
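
If reshape is not installed, the same pairing can be sketched in base R
alone. This is only an illustration under assumptions: the data frame
"ratings" and its values are made up, the subject labels "s1" and "s2"
are hypothetical, and each subject is assumed to rate each face exactly
once per view. It also computes per-subject correlations, in case
pooling all subjects turns out not to be appropriate.

```r
# Toy data in the poster's long format (values are made up for illustration).
ratings <- data.frame(
  name   = rep(c("s1", "s2"), each = 6),
  view   = rep(rep(c("f", "p"), each = 3), times = 2),
  num    = rep(1:3, times = 4),
  rating = c(10, 50, -20, 15, 40, -30, 5, 60, -10, 0, 70, -25)
)

# Split by view, then pair the two ratings of each face within each subject.
front <- ratings[ratings$view == "f", c("name", "num", "rating")]
prof  <- ratings[ratings$view == "p", c("name", "num", "rating")]
wide  <- merge(front, prof, by = c("name", "num"), suffixes = c(".f", ".p"))

# Scatterplot and Pearson correlation, pooled over all subjects.
plot(wide$rating.f, wide$rating.p,
     xlab = "Frontal rating", ylab = "Profile rating")
r_pooled <- cor(wide$rating.f, wide$rating.p)

# Per-subject correlations, to check whether pooling hides differences.
r_by_subject <- sapply(split(wide, wide$name),
                       function(d) cor(d$rating.f, d$rating.p))
```

Here merge() plays the role of cast(): it yields one row per
subject/face with columns rating.f and rating.p.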

Hadley

-- 
http://had.co.nz/



