[R] Using the reshape () function

Tom Backer Johnsen backer at psych.uib.no
Tue Jun 17 16:28:55 CEST 2008


In a research project we are using a web-based tool for collecting data 
from questionnaires.  The system generates files that are simple to read 
as a data frame in the "long" format, which is simple to convert to the 
"wide" format.

Two things might happen: (a) there are two (or more) references to the 
same cell, and (b) there are missing values.  For example, this data set 
has two references to the S2/T2 combination and none to S2/T1:

 > d
      values person time
   1       1     S1   T1
   2       2     S1   T2
   3       3     S1   T3
   4       4     S1   T4
   5      22     S2   T2
   6       6     S2   T2
   7       7     S2   T3
   8       8     S2   T4
   9       9     S3   T1
   10     10     S3   T2
   11     11     S3   T3
   12     12     S3   T4
 > reshape (d, idvar="person", v.names=c("values"), timevar="time",
 +          direction="wide")
    person values.T1 values.T2 values.T3 values.T4
  1     S1         1         2         3         4
  5     S2        NA        22         7         8
  9     S3         9        10        11        12
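
Before reshaping, cells that are referenced more than once or not at all 
can be found by counting rows per person/time combination.  The following 
is only a sketch: it rebuilds the example data by hand, and the names 
"cells", "dup_cells" and "missing_cells" are made up for illustration.

```r
# Rebuild the example data from the listing above.
d <- data.frame(
  values = c(1, 2, 3, 4, 22, 6, 7, 8, 9, 10, 11, 12),
  person = rep(c("S1", "S2", "S3"), each = 4),
  time   = c("T1", "T2", "T3", "T4",
             "T2", "T2", "T3", "T4",
             "T1", "T2", "T3", "T4")
)

# Count the rows that refer to each person/time cell.
# as.data.frame() on a table gives one row per cell with a Freq column.
cells <- as.data.frame(table(person = d$person, time = d$time))

# Cells referenced more than once, and cells never referenced.
dup_cells     <- cells[cells$Freq > 1, ]
missing_cells <- cells[cells$Freq == 0, ]

print(dup_cells)
print(missing_cells)
```

For this data set the first result should contain the S2/T2 cell and the 
second the S2/T1 cell.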

The missing cell gets an NA, as expected.  The surprise is the case 
where there are two references to the same cell: the *first* value is 
used (22 rather than 6).

Is there some way of forcing reshape () to use the *last* value?
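
One possible workaround, offered as a sketch rather than a confirmed 
answer: instead of changing what reshape () does, drop every row that is 
superseded by a later row for the same person/time combination, and 
reshape what is left.  This assumes that row order in the file reflects 
the order in which the answers were recorded; the names "superseded", 
"d_last" and "wide" are made up for illustration.

```r
# Rebuild the example data (values as listed in the post).
d <- data.frame(
  values = c(1, 2, 3, 4, 22, 6, 7, 8, 9, 10, 11, 12),
  person = rep(c("S1", "S2", "S3"), each = 4),
  time   = c("T1", "T2", "T3", "T4",
             "T2", "T2", "T3", "T4",
             "T1", "T2", "T3", "T4")
)

# With fromLast = TRUE, duplicated() scans from the bottom up, so it
# flags every row that has a LATER row with the same person/time pair.
superseded <- duplicated(d[, c("person", "time")], fromLast = TRUE)

# Keep only the last reference to each cell.
d_last <- d[!superseded, ]

# Reshape as before; each cell now has at most one candidate value.
wide <- reshape(d_last, idvar = "person", v.names = "values",
                timevar = "time", direction = "wide")
print(wide)
```

Here the S2 row should show 6 rather than 22 under values.T2, while the 
S2/T1 cell stays NA because no row refers to it.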

Tom


