[R] Arrange two columns into a five variable dataframe

Brian Diggs diggsb at ohsu.edu
Sat Jul 14 06:19:09 CEST 2012


On 7/13/2012 8:37 PM, darnold wrote:
> Hi,
>
> I hope that folks can give me some simple approaches to taking the data set
> below, which is accumulated in two columns called "long" and "group", then
> arrange the data in the "long" column into a data frame containing five
> variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5".  I am
> hoping for a few different techniques which I can pass on to my students.
>
> Thanks
>
> David Arnold
> College of the Redwoods
>
>
>> dput(flies)
> structure(list(long = c(40L, 37L, 44L, 47L, 47L, 47L, 68L, 47L,
> 54L, 61L, 71L, 75L, 89L, 58L, 59L, 62L, 79L, 96L, 58L, 62L, 70L,
> 72L, 74L, 96L, 75L, 46L, 42L, 65L, 46L, 58L, 42L, 48L, 58L, 50L,
> 80L, 63L, 65L, 70L, 70L, 72L, 97L, 46L, 56L, 70L, 70L, 72L, 76L,
> 90L, 76L, 92L, 21L, 40L, 44L, 54L, 36L, 40L, 56L, 60L, 48L, 53L,
> 60L, 60L, 65L, 68L, 60L, 81L, 81L, 48L, 48L, 56L, 68L, 75L, 81L,
> 48L, 68L, 35L, 37L, 49L, 46L, 63L, 39L, 46L, 56L, 63L, 65L, 56L,
> 65L, 70L, 63L, 65L, 70L, 77L, 81L, 86L, 70L, 70L, 77L, 77L, 81L,
> 77L, 16L, 19L, 19L, 32L, 33L, 33L, 30L, 42L, 42L, 33L, 26L, 30L,
> 40L, 54L, 34L, 34L, 47L, 47L, 42L, 47L, 54L, 54L, 56L, 60L, 44L
> ), group = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L), .Label = c("Group 5", "Group 4", "Group 3", "Group 2",
> "Group 1"), class = "factor")), .Names = c("long", "group"), row.names =
> c(NA, -125L), class = "data.frame")

Generally I would recommend either the reshape function or the functions
in the reshape2 package. However, your data doesn't quite have what is
needed to use those. You are implicitly assuming that the first
occurring values in each group go together (should be in the same row),
as do the second ones, and so on. The reshape functions require an
explicit indication (an identifier column) of which values go together.
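
If you do want to go the reshape route, one way to supply that missing information is to add a within-group counter first. The following is only a sketch under assumptions: `toy` is a small made-up stand-in for `flies` (two groups, three values each) and the column name `obs` is my own invention, not something in your data.

```r
# Hypothetical stand-in for the poster's data: 2 groups x 3 values.
toy <- data.frame(
  long  = c(40L, 37L, 44L, 16L, 19L, 19L),
  group = factor(rep(c("Group 1", "Group 5"), each = 3))
)

# Make the implicit pairing explicit: number each value by its
# position within its own group (1, 2, 3, 1, 2, 3).
toy$obs <- ave(seq_along(toy$long), toy$group, FUN = seq_along)

# With an id column in place, base reshape() can spread the groups
# into columns, one row per value of obs.
wide <- reshape(toy, idvar = "obs", timevar = "group",
                v.names = "long", direction = "wide")
```

The result has one row per `obs` and one `long.*` column per group. The same `obs` column should also let the reshape2 functions do the job, since they need the identifier for the same reason.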

The unstack function will work for you, and it makes the same positional assumption.

 > unstack(flies)
    Group.5 Group.4 Group.3 Group.2 Group.1
1       16      35      21      46      40
2       19      37      40      42      37
3       19      49      44      65      44
4       32      46      54      46      47
5       33      63      36      58      47
6       33      39      40      42      47
7       30      46      56      48      68
8       42      56      60      58      47
9       42      63      48      50      54
10      33      65      53      80      61
11      26      56      60      63      71
12      30      65      60      65      75
13      40      70      65      70      89
14      54      63      68      70      58
15      34      65      60      72      59
16      34      70      81      97      62
17      47      77      81      46      79
18      47      81      48      56      96
19      42      86      48      70      58
20      47      70      56      70      62
21      54      70      68      72      70
22      54      77      75      76      72
23      56      77      81      90      74
24      60      81      48      76      96
25      44      77      68      92      75
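
Since you asked for a few different techniques, here is a sketch of two small variations on the above. It runs on a made-up stand-in (`toy`, two groups of three values, with the factor levels reversed as in your data) rather than on `flies` itself; `wide` and `wide2` are just names I picked. Both approaches rely on every group having the same number of values.

```r
# Hypothetical stand-in for flies: two groups of three values each,
# with the levels in the same reversed order as in the original post.
toy <- data.frame(
  long  = c(40L, 37L, 44L, 16L, 19L, 19L),
  group = factor(rep(c("Group 1", "Group 5"), each = 3),
                 levels = c("Group 5", "Group 1"))
)

# Same idea as unstack(toy), with the formula written out.
wide <- unstack(toy, long ~ group)

# Columns come out in factor-level order with syntactic names
# (Group.5, Group.1); reorder them and restore the spaces if wanted.
wide <- wide[, sort(names(wide))]
names(wide) <- sub(".", " ", names(wide), fixed = TRUE)

# A base-R alternative: split the values by group and bind the pieces.
# check.names = FALSE keeps "Group 1" rather than "Group.1".
wide2 <- data.frame(split(toy$long, toy$group), check.names = FALSE)
```

If the groups had different sizes, unstack would hand back a list instead of a data frame, and the data.frame call on the split pieces would fail, so it is worth checking table(flies$group) first.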



-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University


