[R] reshape is re-ordering my variables

Kevin E. Thorpe kevin.thorpe at utoronto.ca
Tue Sep 21 21:01:11 CEST 2010


Is it an undocumented feature of the reshape function (or at least one
I missed in the documentation) that numeric variables are placed before
character ones?  I ask because that seems to be what happens below.

 > str(rcw)
'data.frame':	23 obs. of  21 variables:
  $ ICU              : int  1 18 17 9 22 19 6 16 25 26 ...
  $ Q6.RC.1          : chr  "SM" "JF" "IW" "MS" ...
  $ Q6.FT.RC.1.years : int  0 8 12 3 9 1 5 16 5 5 ...
  $ Q6.FT.RC.1.months: int  0 0 0 0 0 0 0 6 0 0 ...
  $ Q6.PT.RC.1.years : int  2 0 0 1 2 0 0 0 0 0 ...
  $ Q6.PT.RC.1.months: int  0 0 0 0 0 0 0 0 0 0 ...
  $ Q6.RC.2          : chr  "BA" "ML" "TM" "YL" ...
  $ Q6.FT.RC.2.years : int  0 0 7 3 0 99999 0 0 0 0 ...
  $ Q6.FT.RC.2.months: int  0 0 0 0 0 99999 0 0 0 0 ...
  $ Q6.PT.RC.2.years : int  2 10 2 0 0 99999 0 5 0 0 ...
  $ Q6.PT.RC.2.months: int  0 0 0 0 8 99999 1 0 6 6 ...
  $ Q6.RC.3          : chr  "LL" "TM" "99999" "99999" ...
  $ Q6.FT.RC.3.years : int  6 0 99999 99999 99999 99999 0 99999 0 0 ...
  $ Q6.FT.RC.3.months: int  0 0 99999 99999 99999 99999 0 99999 0 0 ...
  $ Q6.PT.RC.3.years : int  0 8 99999 99999 99999 99999 0 99999 0 0 ...
  $ Q6.PT.RC.3.months: int  0 0 99999 99999 99999 99999 1 99999 4 4 ...
  $ Q6.RC.4          : chr  "99999" "IW" "99999" "99999" ...
  $ Q6.FT.RC.4.years : int  99999 0 99999 99999 99999 99999 99999 99999 99999 99999 ...
  $ Q6.FT.RC.4.months: int  99999 0 99999 99999 99999 99999 99999 99999 99999 99999 ...
  $ Q6.PT.RC.4.years : int  99999 12 99999 99999 99999 99999 99999 99999 99999 99999 ...
  $ Q6.PT.RC.4.months: int  99999 0 99999 99999 99999 99999 99999 99999 99999 99999 ...

This data frame needs to be converted to long format with 5 variables 
repeating over 4 observations.

 > rcl <- reshape(rcw, idvar="ICU", varying=2:21, direction="long",
 +                v.names=c("init","FTy","FTm","PTy","PTm"))

 > str(rcl)
'data.frame':	92 obs. of  7 variables:
  $ ICU : int  1 18 17 9 22 19 6 16 25 26 ...
  $ time: int  1 1 1 1 1 1 1 1 1 1 ...
  $ init: int  0 0 0 0 0 0 0 6 0 0 ...
  $ FTy : int  0 8 12 3 9 1 5 16 5 5 ...
  $ FTm : int  0 0 0 0 0 0 0 0 0 0 ...
  $ PTy : int  2 0 0 1 2 0 0 0 0 0 ...
  $ PTm : chr  "SM" "JF" "IW" "MS" ...
  - attr(*, "reshapeLong")=List of 4
   ..$ varying:List of 5
   .. ..$ FTm : chr  "Q6.FT.RC.1.months" "Q6.FT.RC.2.months" "Q6.FT.RC.3.months" "Q6.FT.RC.4.months"
   .. ..$ FTy : chr  "Q6.FT.RC.1.years" "Q6.FT.RC.2.years" "Q6.FT.RC.3.years" "Q6.FT.RC.4.years"
   .. ..$ PTm : chr  "Q6.PT.RC.1.months" "Q6.PT.RC.2.months" "Q6.PT.RC.3.months" "Q6.PT.RC.4.months"
   .. ..$ PTy : chr  "Q6.PT.RC.1.years" "Q6.PT.RC.2.years" "Q6.PT.RC.3.years" "Q6.PT.RC.4.years"
   .. ..$ init: chr  "Q6.RC.1" "Q6.RC.2" "Q6.RC.3" "Q6.RC.4"
   .. ..- attr(*, "v.names")= chr  "init" "FTy" "FTm" "PTy" ...
   .. ..- attr(*, "times")= int  1 2 3 4
   ..$ v.names: chr  "init" "FTy" "FTm" "PTy" ...
   ..$ idvar  : chr "ICU"
   ..$ timevar: chr "time"

In the result, the values of the first of the varying variables go
into the last variable, while the other values are shifted left.  The
attributes in the result are correct, but the contents of rcl$PTm are
what I expected in rcl$init.
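
One observation that may be relevant: the "varying" attribute above lists
the groups in sorted order (FTm, FTy, PTm, PTy, init), while "v.names"
keeps the order given in the call, which would be consistent with the two
being matched by position.  For comparison, here is a minimal sketch on a
made-up data frame (the object `toy` and its column names are hypothetical,
not taken from the data above) that passes `varying` as a list, so that
each new variable is tied to its source columns explicitly instead of
relying on the order of a column-index vector:

```r
# Hypothetical toy data: one character and one integer variable,
# each measured at two occasions.
toy <- data.frame(
  id      = 1:3,
  init.1  = c("SM", "JF", "IW"),
  years.1 = c(0L, 8L, 12L),
  init.2  = c("BA", "ML", "TM"),
  years.2 = c(0L, 0L, 7L),
  stringsAsFactors = FALSE
)

# 'varying' as a list: one element per long-format variable, given in the
# same order as 'v.names', each holding that variable's wide columns in
# time order.
long <- reshape(
  toy,
  idvar     = "id",
  direction = "long",
  varying   = list(c("init.1", "init.2"),
                   c("years.1", "years.2")),
  v.names   = c("init", "years"),
  times     = 1:2
)

str(long)
```

Applied to rcw, the same idea would mean one list element per new
variable, for example c("Q6.RC.1", "Q6.RC.2", "Q6.RC.3", "Q6.RC.4") for
"init", listed in the same order as v.names.  This is untested against
the original data.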

 > sessionInfo()
R version 2.11.1 Patched (2010-07-21 r52598)
Platform: i686-pc-linux-gnu (32-bit)

locale:
  [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
  [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=en_US
  [7] LC_PAPER=en_US       LC_NAME=C            LC_ADDRESS=C
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.11.1

-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016


