[Rd] Incorrect behavior for ordering timepoints in "reshape" (PR#7669)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Tue Feb 8 00:38:28 CET 2005


davclark at nyu.edu writes:

> Full_Name: Dav Clark
> Version: 2.0.1
> OS: OS X 10.3
> Submission from: (NULL) (128.122.87.35)
> 
> 
> When the timepoints that reshape uses (in direction="long") are negative or
> fractional, the time label is assigned incorrectly.  It is easier to give an
> example than to describe the problem abstractly:
> 
> Assume you have a data.frame header with values related to peri-stimulus time
> like this:
> 
> "HRF -5" "HRF -2.5" "HRF 0" "HRF 2.5" ... "HRF 10"
> 
> And you give reshape a split argument of a space " ".
> 
> Then the labels will be assigned strangely, based on alphabetical ordering.  So
> the above list order maps to:
> 
> -2.5, -5, 0, 10, ... 2.5
> 
> Items under the "HRF -5" column in wide format receive a -2.5 label, items under
> "HRF 2.5" receive a label of 10, and so on.
> 
> Somewhere, the time labels are being used before conversion to numbers.  But,
> reshape returns an error if it is not possible to convert the timepoints to
> numeric!  So obviously, more functionality could be provided, or at least the
> documentation should reflect the current shortfall.
> 
> For completeness, here is a minimal example demonstrating the bug:
> 
> df <- data.frame(id="S1", V1="from -2", V2="from -1")
> names(df)[2:3] <- c("vals.-2", "vals.-1")
> df
> reshape(df, direction="long", varying=2:3)

Hmm, this looks messed up even without the negatives. The guess()
function inside reshape always sorts the time labels as strings before
converting them to numeric, so you get the 1 10 11 2 3 4 5 6 7 8 9
effect. What is worse, the sorting decouples the values from the
variable names, as demonstrated by modifying your example slightly:

> reshape(df, direction="long", varying=3:2)
      id time    vals
S1.-1 S1   -1 from -1
S1.-2 S1   -2 from -2
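
To make the ordering effect concrete, here is a small sketch that is
independent of reshape itself. The labels vector is a made-up stand-in
for the time labels split off the column names; it only shows what
sorting character strings does compared with sorting numbers.

```r
# Hypothetical time labels, as they would look after being split off
# column names such as "vals.1", ..., "vals.11" (still character strings)
labels <- as.character(1:11)

# Sorting the strings compares them character by character
sorted_as_text <- sort(labels)
print(sorted_as_text)
# "1" "10" "11" "2" "3" "4" "5" "6" "7" "8" "9"

# Converting to numeric before sorting gives the intended order
sorted_as_numbers <- sort(as.numeric(labels))
print(sorted_as_numbers)
# 1 2 3 4 5 6 7 8 9 10 11
```

With negative or fractional labels the string order also depends on how
the locale collates "-" and ".", so I have left those out of the sketch.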

I'm not at all sure I understand what was supposed to happen here;
perhaps the sort in

    varying <- unique(nn[, 1])
    times <- sort(unique(nn[, 2]))

is a thinko? Over to Thomas, I think.
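
For illustration only, the following sketch shows one way the time
labels could stay paired with their columns. It is not the code inside
reshape and not a proposed patch; the names nms, stem, label and times
are made up for the example, and the split at the first "." is an
assumption about how the names are formed.

```r
# Hypothetical column names, deliberately not in sorted order
nms <- c("vals.10", "vals.2", "vals.1")

# Split each name at the first "." into a stem and a time label
stem  <- sub("\\..*$", "", nms)      # everything before the first "."
label <- sub("^[^.]*\\.", "", nms)   # everything after the first "."

# Sorting the labels on their own loses the link to the columns:
# position i of the result no longer refers to nms[i]
sorted_labels <- sort(unique(label))

# Converting in place keeps times[i] paired with nms[i]
times <- as.numeric(label)
pairing <- data.frame(column = nms, time = times)
print(pairing)
```

If a sorted layout is wanted, reordering nms and times together with
order(times) would keep the pairing intact.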

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907


