[R] reshape question

Wed Nov 25 04:06:31 CET 2009

What about the melt function in the reshape package?

EX:

> library(reshape)

> x=sample(1:100,20,replace=T)

> x
 [1] 48 94 32 96 81 99 10 64 64 94 57 60 16 64 32 76 63  1 64  8

> y=sample(1:100,20,replace=T)

> y
 [1] 73 78 82 43 58 85 74 64 73 41 45 38 63 36 44 74  7 88 91  1

> xy=cbind(x,y)

> melt(xy)
   X1 X2 value
1   1  x    48
2   2  x    94
3   3  x    32
4   4  x    96
5   5  x    81
6   6  x    99
7   7  x    10
8   8  x    64
9   9  x    64
10 10  x    94
11 11  x    57
12 12  x    60
13 13  x    16
14 14  x    64
15 15  x    32
16 16  x    76
17 17  x    63
18 18  x     1
19 19  x    64
20 20  x     8
21  1  y    73
22  2  y    78
23  3  y    82
24  4  y    43
25  5  y    58
26  6  y    85
27  7  y    74
28  8  y    64
29  9  y    73
30 10  y    41
31 11  y    45
32 12  y    38
33 13  y    63
34 14  y    36
35 15  y    44
36 16  y    74
37 17  y     7
38 18  y    88
39 19  y    91
40 20  y     1
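
For anyone without the reshape package installed, roughly the same long
layout can be built in base R. This is only a sketch: the seed is arbitrary,
and the column names X1, X2 and value are chosen to mirror the melt output
above rather than produced by any melt call.

```r
set.seed(1)  # arbitrary seed, only so the draw is repeatable
x <- sample(1:100, 20, replace = TRUE)
y <- sample(1:100, 20, replace = TRUE)
xy <- cbind(x, y)  # 20 x 2 matrix with column names "x" and "y"

# One row per matrix cell: row index, column name, cell value.
# as.vector() walks the matrix column by column, so all x rows come first.
long <- data.frame(
  X1    = as.vector(row(xy)),
  X2    = colnames(xy)[as.vector(col(xy))],
  value = as.vector(xy)
)
head(long, 3)
```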

Joe King
206-913-2912
jp at joepking.com
"Never throughout history has a man who lived a life of ease left a name
worth remembering." --Theodore Roosevelt

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of David Winsemius
Sent: Tuesday, November 24, 2009 6:43 PM
To: AC Del Re
Cc: r-help at r-project.org
Subject: Re: [R] reshape question

On Nov 24, 2009, at 8:33 PM, AC Del Re wrote:

> Hi All,
>
> I am wanting to convert a data.frame from a wide format to a long format
> (with >1 variable) and am having difficulties. Any help is appreciated!
>
> #current wide format
>> head(data.out2)
>     id   rater.1 n.1   rater.2 n.2   rater.3 n.3   rater.4 n.4
> 11   11 0.1183333  79        NA  NA        NA  NA        NA  NA
> 114 114 0.2478709 113        NA  NA        NA  NA        NA  NA
> 12   12 0.3130655  54 0.3668242  54        NA  NA        NA  NA
> 121 121 0.2400000 331        NA  NA        NA  NA        NA  NA
> 122 122 0.3004164  25 0.1046278  25 0.2424871  25 0.2796937  25
> 125 125 0.1634865 190        NA  NA        NA  NA        NA  NA
>
> #This is close but I would like the 'n' column to remain and for the
> '.1' to drop off

I don't really understand what you want and the example solution  
throws away quite a lot of data, so consider this alternative:

# The header has one field fewer than the data rows, so read.table()
# takes the first column of each row as row names.
data.out2 <- read.table(textConnection("id   rater.1 n.1   rater.2 n.2   rater.3 n.3   rater.4 n.4
11   11 0.1183333  79        NA  NA        NA  NA        NA  NA
114 114 0.2478709 113        NA  NA        NA  NA        NA  NA
12   12 0.3130655  54 0.3668242  54        NA  NA        NA  NA
121 121 0.2400000 331        NA  NA        NA  NA        NA  NA
122 122 0.3004164  25 0.1046278  25 0.2424871  25 0.2796937  25
125 125 0.1634865 190        NA  NA        NA  NA        NA  NA"),
header=TRUE, stringsAsFactors=FALSE)

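# Anchored patterns so that only the rater.* and n.* columns are selected;
# the 6 ids are recycled to match the 24 stacked rows.
data.frame(id= data.out2$id,
   rater= stack(data.out2[,grep("^rater\\.", names(data.out2))]),
   n= stack(data.out2[,grep("^n\\.", names(data.out2))]) )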

              id rater.values rater.ind n.values n.ind
1            11    0.1183333   rater.1       79   n.1
2           114    0.2478709   rater.1      113   n.1
3            12    0.3130655   rater.1       54   n.1
4           121    0.2400000   rater.1      331   n.1
5           122    0.3004164   rater.1       25   n.1
6           125    0.1634865   rater.1      190   n.1
7            11           NA   rater.2       NA   n.2
8           114           NA   rater.2       NA   n.2
9            12    0.3668242   rater.2       54   n.2
10          121           NA   rater.2       NA   n.2
11          122    0.1046278   rater.2       25   n.2
12          125           NA   rater.2       NA   n.2
13           11           NA   rater.3       NA   n.3
14          114           NA   rater.3       NA   n.3
15           12           NA   rater.3       NA   n.3
16          121           NA   rater.3       NA   n.3
17          122    0.2424871   rater.3       25   n.3
18          125           NA   rater.3       NA   n.3
19           11           NA   rater.4       NA   n.4
20          114           NA   rater.4       NA   n.4
21           12           NA   rater.4       NA   n.4
22          121           NA   rater.4       NA   n.4
23          122    0.2796937   rater.4       25   n.4
24          125           NA   rater.4       NA   n.4

Take what you like from this version, which I would consider to lose
none of the original information.
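
If the aim is to keep 'n' beside 'rater' and drop the '.1' style suffix, the
stacked result can be trimmed further. A sketch only, assuming the suffix
after "rater." is the index wanted and that rows with a missing rater value
are not needed; the data frame is rebuilt inline from the values posted, and
"long" and "time" are illustrative names.

```r
# Example data, values copied from the post.
data.out2 <- data.frame(
  id      = c(11, 114, 12, 121, 122, 125),
  rater.1 = c(0.1183333, 0.2478709, 0.3130655, 0.2400000, 0.3004164, 0.1634865),
  n.1     = c(79, 113, 54, 331, 25, 190),
  rater.2 = c(NA, NA, 0.3668242, NA, 0.1046278, NA),
  n.2     = c(NA, NA, 54, NA, 25, NA),
  rater.3 = c(NA, NA, NA, NA, 0.2424871, NA),
  n.3     = c(NA, NA, NA, NA, 25, NA),
  rater.4 = c(NA, NA, NA, NA, 0.2796937, NA),
  n.4     = c(NA, NA, NA, NA, 25, NA)
)

# Stack the two column families separately; both come out in the same
# column-by-column order, so they line up row for row.
rater <- stack(data.out2[, grep("^rater\\.", names(data.out2))])
n     <- stack(data.out2[, grep("^n\\.", names(data.out2))])

long <- data.frame(
  id    = rep(data.out2$id, times = 4),  # one copy of the ids per rater column
  time  = as.integer(sub("^rater\\.", "", as.character(rater$ind))),
  rater = rater$values,
  n     = n$values
)

# Drop the padding rows and sort by id, then by index within id.
long <- long[!is.na(long$rater), ]
long <- long[order(long$id, long$time), ]
long
```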

>
>> data.out3<-reshape(data.out2,varying=list(names(data.out2)[-1]),
> +  idvar='id',direction='long')
>> head(data.out3)
>       id time   rater.1
> 11.1   11    1 0.1183333
> 114.1 114    1 0.2478709
> 12.1   12    1 0.3130655
> 121.1 121    1 0.2400000
> 122.1 122    1 0.3004164
> 125.1 125    1 0.1634865
>
> Ideally I would like the columns to be set up in this manner:
>
> id    time    rater     n

What is "time"?
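
In the reshape() output quoted above, "time" is the index column that
reshape() adds to say which of the wide columns each long row came from.
Because varying was given as a list holding one vector of all eight names,
they were treated as eight occasions of a single variable, which is why only
one value column (named rater.1) appears. A sketch of one way to get the
id/time/rater/n layout instead, assuming the number after the "." is the
intended index: pass varying as a plain character vector with sep = "." so
that reshape() splits the names itself. The data are re-entered without the
row-name column, and "long" is an illustrative name.

```r
data.out2 <- read.table(text = "id rater.1 n.1 rater.2 n.2 rater.3 n.3 rater.4 n.4
11  0.1183333  79        NA NA        NA NA        NA NA
114 0.2478709 113        NA NA        NA NA        NA NA
12  0.3130655  54 0.3668242 54        NA NA        NA NA
121 0.2400000 331        NA NA        NA NA        NA NA
122 0.3004164  25 0.1046278 25 0.2424871 25 0.2796937 25
125 0.1634865 190        NA NA        NA NA        NA NA",
header = TRUE)

# Names such as "rater.1" are split at "." into a variable name
# ("rater", "n") and a time value (1 to 4).
long <- reshape(data.out2, direction = "long", idvar = "id",
                varying = names(data.out2)[-1], sep = ".")

# Optional: drop the rows that exist only because some ids have fewer raters.
long_complete <- long[!is.na(long$rater), ]
head(long)
```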

>
>
> Thanks,

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.