[R] Condition to factor (easy to remember)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Sep 30 21:54:40 CEST 2009


Douglas Bates wrote:
> On Wed, Sep 30, 2009 at 2:42 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Wed, Sep 30, 2009 at 2:43 AM, Dieter Menne
>> <dieter.menne at menne-biomed.de> wrote:
>>
>>> Dear List,
>>> creating factors in a given non-default order is notoriously difficult to
>>> explain in a course. Students love the ifelse construct given below most,
>>> but I remember some comment from Martin Mächler (?) that ifelse should be
>>> banned from courses.
>>> Any better idea? Not necessarily short, easy to remember is important.
>>> Dieter
>>> data = c(1,7,10,50,70)
>>> levs = c("Pre","Post")
>>>
>>> # Typical C-Programmer style
>>> factor(levs[as.integer(data > 10) + 1], levels=levs)
>>>
>>> # Easiest to understand
>>> factor(ifelse(data <= 10, levs[1], levs[2]), levels=levs)
>> Why not
>>
>> > factor(data > 10, labels = c("Pre", "Post"))
>> [1] Pre  Pre  Pre  Post Post
>> Levels: Pre Post
>>
>> All you have to remember is that FALSE comes before TRUE.
> 
> And besides, Frank Harrell will soon be weighing in to tell you why
> you shouldn't dichotomize in the first place.

And someone might also remind you that it is safest to include
levels=c(FALSE,TRUE), just in case the condition is always TRUE. (Terry
Therneau has the scars from the implementation of Surv()...)
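
A small sketch of that pitfall, reusing data and levs from Dieter's post.
The vector `late` and the helper `cond_factor` are made-up names for
illustration only, not anything from base R. When every element satisfies
the condition, factor() sees a single level and cannot match two labels
to it; spelling out the levels avoids that.

```r
data <- c(1, 7, 10, 50, 70)
levs <- c("Pre", "Post")

# Mixed FALSE/TRUE values: both levels are found, so two labels fit.
f_mixed <- factor(data > 10, labels = levs)

# Hypothetical subset in which the condition is always TRUE.
late <- c(50, 70)

# Only one level (TRUE) is found here, so supplying two labels is an error.
res <- try(factor(late > 10, labels = levs), silent = TRUE)
failed <- inherits(res, "try-error")

# Explicit levels keep both categories, even if one of them is empty.
f_safe <- factor(late > 10, levels = c(FALSE, TRUE), labels = levs)

# Hypothetical wrapper (assumed name) that students could memorise:
# FALSE maps to the first label, TRUE to the second.
cond_factor <- function(cond, labels) {
  factor(cond, levels = c(FALSE, TRUE), labels = labels)
}
f_wrapped <- cond_factor(data > 10, levs)
```

With the explicit levels, levels(f_safe) is still c("Pre", "Post") although
"Pre" never occurs, so later tables and model contrasts keep both
categories.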

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



