[R] Strange behavior when subsetting data frames with NAs

Prof Brian D Ripley ripley at stats.ox.ac.uk
Thu Mar 7 08:22:02 CET 2002


On Wed, 6 Mar 2002 apjaworski at mmm.com wrote:

> Here is what I get using R 1.4.1 on Win2k (using precompiled version from
> CRAN) and RH 7.2 Linux (compiled from source):
>
>      >  data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
>      > zz[zz[,2]>2, ]
>              a  b
>      X1   1  3
>      X3   3  3
>      NA  NA NA
>      NA1 NA NA
>
> (If there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
>
>      > zz1 <- na.omit(zz)
>      > zz1[zz1[,2]>2, ]
>         a b
>      1 1 3
>      3 3 3
>
> also
>
>      > as.matrix(zz) -> zz
>      > zz[zz[,2]>2, ]
>          a  b
>      1   1  3
>      3   3  3
>      NA NA NA
>      NA NA NA
>
> I am not sure if this is a bug or a feature, so I am reporting it here.

What exactly do you find strange?  It is the correct behaviour and
replicates that of S.  Remember that data frames have to have unique row
names, and you asked for rows

> zz[,2]>2
[1]  TRUE FALSE  TRUE    NA    NA

so new row names have to be created.  Matrices do not have to have unique
dimnames.
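
If the NA rows are not wanted, the NAs in the logical index can be dropped
before subsetting. The following is only a sketch, assuming a current
version of R; the names idx, r1, r2 and r3 are made up for illustration and
do not come from the original report.

```r
zz <- data.frame(a = c(1, 2, 3, NA, NA), b = c(3, 1, 3, NA, NA))

# An NA in a logical index selects an NA element rather than nothing.
# The same rule applied to rows of a data frame gives the all-NA rows.
v <- c(10, 20, 30)[c(TRUE, NA, FALSE)]

# The comparison is NA wherever b is NA.
idx <- zz[, 2] > 2

# which() returns only the positions that are TRUE, so NA and FALSE drop out.
r1 <- zz[which(idx), ]

# The same thing with the NA handling spelled out.
r2 <- zz[!is.na(idx) & idx, ]

# subset() treats an NA condition as FALSE.
r3 <- subset(zz, b > 2)
```

Each of r1, r2 and r3 should hold only the two rows with b greater than 2.
Whether to drop the NA rows or keep them is a decision about the data, so
the default behaviour does not make that choice silently.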

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
