Factor structures not preserved after dump/dput (PR#200)

Peter Dalgaard BSA p.dalgaard@biostat.ku.dk
27 May 1999 19:06:37 +0200


msa@biostat.mgh.harvard.edu writes:

>   a <- factor(1:5,1:5,c('a','b','c','d','e'))
>   b <- a[3:5]
>   dput(b,'b.data')
>   new.b <- dget('b.data')
> 
> Then b is not the same as new.b:
>   > b
>   [1] c d e
>   Levels:  a b c d e 
>   > new.b
>   [1] a b c
>   Levels:  a b c d e 
> 
> This seems to be a very serious bug. It can cause one to
> mislabel treatments: a very embarrassing (and potentially
> disastrous) mistake. The reason for this bug seems to
> lie in the way in which structure() treats factor structures.

Ouch! The fix seems to be to have structure() contain

        if (is.numeric(.Data) && any(names(attrib) == "levels")) 
            .Data <- factor(.Data, levels = 1:max(1,.Data))

rather than just plain factor(.Data), which renumbers the stored codes
3:5 to 1:3 before the levels are reattached.  Will commit this in a moment.
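
For anyone who wants to see the difference in isolation, here is a minimal
sketch. It only compares the two factor() calls on the codes from the report;
it is not the actual body of structure(), and the names codes, labels, before
and after are made up for the illustration. The round trip at the end assumes
an R version in which dput()/dget() restore factors correctly.

```r
# Codes and labels as in the report: b holds codes 3, 4, 5 of five levels.
labels <- c("a", "b", "c", "d", "e")
codes  <- c(3, 4, 5)

# Plain factor(.Data): levels become "3","4","5", so the codes collapse to 1, 2, 3.
before <- as.integer(factor(codes))

# With explicit levels 1:max(1, .Data) the codes stay 3, 4, 5.
after <- as.integer(factor(codes, levels = 1:max(1, codes)))

# Reattaching the labels shows the mislabelling and the repaired result.
print(labels[before])
print(labels[after])

# Round-trip check of the original example (uses a temporary file
# instead of 'b.data'; assumes a fixed structure()).
a <- factor(1:5, 1:5, c("a", "b", "c", "d", "e"))
b <- a[3:5]
tmp <- tempfile()
dput(b, tmp)
new.b <- dget(tmp)
stopifnot(identical(as.character(b), as.character(new.b)))
```

The max(1, .Data) guard keeps 1:max(...) from counting downwards when the
largest code is below 1.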

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907