[R] Bug in levels() function?

Groot, Philip de philip.degroot at wur.nl
Tue Jan 29 09:02:44 CET 2008


Hello all,
 
Thank you for all the responses. It is clear to me now. However, if the "drop" option had also been mentioned in the help text (in R, "?levels"), I wouldn't have asked this question at all!
 
Regards,
 
Philip

________________________________

From: Thomas Lumley [mailto:tlumley at u.washington.edu]
Sent: Mon 28-1-2008 20:03
To: Groot, Philip de
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Bug in levels() function?




This is not a bug; it is deliberately designed this way.

There are circumstances when you want to drop levels on subsetting and
other circumstances where you don't, so the default behaviour can't make
everyone happy.  However, there is an option to get the behaviour you want:
> x<-as.factor(LETTERS)
> levels(x[1])
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
> levels(x[1,drop=TRUE])
[1] "A"
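
The same approach carries over to a logical index like the one in the original question. The following is only a minimal sketch: "sibs" and "keep" are made-up stand-ins for Sibships and Comparison_Indices, not the poster's data. It shows the default behaviour, the drop=TRUE argument, and re-creating the factor with factor() after subsetting.

```r
# Hypothetical stand-in for the Sibships factor: four sibships, two samples each.
sibs <- factor(rep(c("P_1", "P_2", "P_3", "P_4"), each = 2))
# Hypothetical stand-in for Comparison_Indices: keep the first and last sibship.
keep <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)

kept_all  <- sibs[keep]               # default: all four levels are retained
dropped   <- sibs[keep, drop = TRUE]  # unused levels removed while subsetting
releveled <- factor(sibs[keep])       # same effect: rebuild the factor afterwards

nlevels(kept_all)   # 4
nlevels(dropped)    # 2
levels(releveled)   # "P_1" "P_4"

# One reason to keep unused levels: empty groups still show up in a tabulation.
counts <- table(kept_all)             # counts 2 0 0 2 for P_1 .. P_4
```

Whether to drop depends on what happens next: a tabulation or a model that should report empty groups wants the full set of levels, while a per-sibship loop over levels() wants only the ones still present. Later R releases also provide droplevels() for the same purpose.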


On Mon, 28 Jan 2008, Groot, Philip de wrote:

> Hello all,
>
> I am not sure whether it actually is a bug, but it is not the behaviour I would expect. Please consider this:
>
>> Sibships
> [1] Patient_2400 Patient_2400 Patient_345  Patient_345  Patient_8901
> [6] Patient_8901 Patient_4008 Patient_4008 Patient_7991 Patient_7991
> [11] Patient_8353 Patient_8353 Patient_1212 Patient_1212 Patient_2168
> [16] Patient_2168 Patient_2760 Patient_2760 Patient_4726 Patient_4726
> [21] Patient_6699 Patient_6699 Patient_7641 Patient_7641 Patient_8263
> [26] Patient_8263 Patient_1389 Patient_1389 Patient_1618 Patient_1618
> [31] Patient_2410 Patient_2410 Patient_2612 Patient_2612 Patient_2721
> [36] Patient_2721 Patient_5053 Patient_5053 Patient_8458 Patient_8458
> [41] Patient_211  Patient_211  Patient_9004 Patient_9004 Patient_3423
> [46] Patient_3423 Patient_7413 Patient_7413 Patient_7815 Patient_7815
> [51] Patient_9232 Patient_9232 Patient_2267 Patient_2267 Patient_468
> [56] Patient_468
> 28 Levels: Patient_1212 Patient_1389 Patient_1618 Patient_211 ... Patient_9232
>
>> Comparison_Indices
> [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
> [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
>> Sibships[Comparison_Indices]
> [1] Patient_2400 Patient_2400 Patient_345  Patient_345  Patient_8901
> [6] Patient_8901 Patient_7413 Patient_7413
> 28 Levels: Patient_1212 Patient_1389 Patient_1618 Patient_211 ... Patient_9232
>
> The problem with this last command is that I would expect 4 levels, because only 8 "Comparison_Indices" are TRUE, which corresponds to 4 sibships. So levels() does not take the indices into account. Stated otherwise: if you take a subset of a vector, the levels are not updated properly (in my opinion).
>
> What I additionally found is the following:
>> small_test <- factor(x=c("a", "b", "c"))
>> typeof(small_test)
> [1] "integer"
>
> The same happens with Sibships, which I also defined as a factor. Why is it of type integer?
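
On the typeof() question: a factor is stored as a vector of integer codes with a "levels" attribute holding the labels, so typeof() reports the storage type while class() reports "factor". The sketch below uses a small made-up factor, not the poster's data, to make that visible.

```r
# Small illustrative factor (a repeated value makes the codes easier to read).
small_test <- factor(c("a", "b", "c", "a"))

typeof(small_test)        # "integer": how the values are stored internally
class(small_test)         # "factor": the class that gives the codes meaning
unclass(small_test)       # the integer codes 1 2 3 1 plus a "levels" attribute
as.integer(small_test)    # 1 2 3 1
as.character(small_test)  # "a" "b" "c" "a"
```

This is also why subsetting leaves levels() unchanged by default: only the integer codes are subset, and the "levels" attribute is carried along as it is.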
>
> This is the output of version:
>> version
>               _
> platform       x86_64-unknown-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          2
> minor          6.1
> year           2007
> month          11
> day            26
> svn rev        43537
> language       R
> version.string R version 2.6.1 (2007-11-26)
>>
>
> So: should I submit a Bug report?
>
> Regards,
>
> Dr. Philip de Groot
> Wageningen University
>
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Thomas Lumley                   Assoc. Professor, Biostatistics
tlumley at u.washington.edu        University of Washington, Seattle


