misuses of == (was [R] length() misbehaving?)

ripley@stats.ox.ac.uk ripley at stats.ox.ac.uk
Fri Mar 14 16:58:50 CET 2003


It's the users who are misbehaving -- it usually is!

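I think you mean [byyr$cnd95 %in% "tr"], which is not the same thing as
[byyr$cnd95 == "tr"] because R has NA character strings: == returns NA
for them, and indexing by NA yields an NA element.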

> x <- c("a", "a", NA, "b2")
> x == "a"
[1]  TRUE  TRUE    NA FALSE
> x[x == "a"]
[1] "a" "a" NA 
> x[x %in% "a"]
[1] "a" "a"

MASS4 page 30 discusses this and similar traps.
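
As a minimal sketch of the miscount in the question below: the vector
cnd95 here is a made-up stand-in for byyr$cnd95 (66 "tr", 69 "c" and
2 NA, per the poster's description), not the real data, and the n_*
names are only for illustration.

```r
# Hypothetical stand-in for byyr$cnd95: 66 "tr", 69 "c", 2 NA, as a factor
cnd95 <- factor(c(rep("tr", 66), rep("c", 69), NA, NA))

# == gives NA for the two missing values; indexing by NA adds NA elements
n_eq <- length(cnd95[cnd95 == "tr"])

# %in% never returns NA, so the missing values are simply not selected
n_in <- length(cnd95[cnd95 %in% "tr"])

# which() keeps only the positions that are TRUE, dropping NA and FALSE
n_which <- length(cnd95[which(cnd95 == "tr")])

# Counting the TRUEs directly, ignoring NA
n_sum <- sum(cnd95 == "tr", na.rm = TRUE)

print(c(n_eq = n_eq, n_in = n_in, n_which = n_which, n_sum = n_sum))
```

Under those assumptions only the first count should include the two NA
entries; the other three forms should agree with summary().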

On Fri, 14 Mar 2003, David Parkhurst wrote:

> I'm having a weird problem with length(), in R 1.6.1 under Windows 2000.  I have a
> dataframe called byyr, with ten columns, the first of which is named cnd95.
> summary(byyr) shows that byyr$cnd95 contains the factor level "tr" 66 times.  Also,
> when I enter byyr$cnd95 at the command line, I can count 66 "tr" elements in the
> resulting vector.  However, when I enter
> 
> n95trt <- length(byyr$cnd95[byyr$cnd95=="tr"])
> n95trt
> 
> the result is 68!  Any ideas why this is happening, and how I can fix the miscount?
> (That column also contains 69 entries of "c", and (relevantly?) two NA's.)
> 
> Thanks for any help.
> 
> Dave Parkhurst
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


