[Rd] subset and missing value indexing

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Aug 17 20:13:59 CEST 2004


On Tue, 17 Aug 2004, Adaikalavan Ramasamy wrote:

> Out of curiosity, is this a bug or a feature of "=="?

A documented feature.
 
> m <- matrix( 1:12, nc=4 )
> f <- c("A", NA, "B", "A")
> 
> f == "A"
> [1]  TRUE    NA FALSE  TRUE
> 
> m[ , f == "A"]         # equivalent to m[ , c(1, NA, 4) ]
>      [,1] [,2] [,3]
> [1,]    1   NA   10
> [2,]    2   NA   11
> [3,]    3   NA   12
> 
> m[ , which(f == "A")]
>      [,1] [,2]
> [1,]    1   10
> [2,]    2   11
> [3,]    3   12
> 
> 
> In the Arguments section of help("which") it says that
>    'NA's are allowed and omitted (treated as if 'FALSE').
> 
> After some thinking, I think this might be due to subsetting with an index
> that includes a missing value. help("[") appears not to say what happens
> when one of the indexing elements is a missing value, only that the index
> can be logical ( and NA is logical ).
>
> Is there any reason for allowing NA when subsetting ?

Yes: it is part of the S language and widely used to avoid special cases
when programming.  Since the index value is unknown, the column (in your
case) might or might not be selected, so the only thing to do is to
return NA.
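
As a minimal sketch of the point above, using the poster's own m and f:
an NA in a logical index produces an all-NA column, while which() or an
explicit is.na() test treats the NA as FALSE.  The variable names idx,
with_na, by_which and by_flag are illustrative only and do not come from
the original post.

```r
m <- matrix(1:12, ncol = 4)
f <- c("A", NA, "B", "A")

idx <- f == "A"                      # TRUE NA FALSE TRUE

# An NA in the logical index gives a column of NAs in the result:
# it is unknown whether that column should have been selected.
with_na <- m[, idx]                  # 3 x 3, middle column all NA

# Two ways to treat NA as FALSE instead (illustrative names):
by_which <- m[, which(idx)]          # which() omits NA
by_flag  <- m[, !is.na(idx) & idx]   # explicit NA test

dim(with_na)
identical(by_which, by_flag)
```

Which form to prefer is a matter of taste; which() returns integer
positions, whereas the is.na() form keeps the index logical.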

There are a lot of things help("[") does not say, such as what happens to
out-of-range indices.  It is in the reference given there (p.358), as
well as in the R Language Reference, section 3.4.1 in the version I looked
at.
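
For the out-of-range case mentioned above, a small sketch of what R does
in practice (the message does not spell this out, so treat it as an
illustration rather than a quotation from the references; x, beyond_end
and res are illustrative names): indexing a vector past its end returns
NA, whereas an out-of-range matrix subscript is an error.

```r
x <- c(10, 20, 30)
m <- matrix(1:12, ncol = 4)

# Vector: a positive index beyond the end returns NA.
beyond_end <- x[5]

# Matrix: an out-of-range column subscript raises an error,
# caught here so the sketch runs to completion.
res <- tryCatch(m[, 5], error = function(e) "error")

is.na(beyond_end)
res
```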

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


