[R] binary operators that never return missing values

Duncan Murdoch murdoch.duncan at gmail.com
Thu Jun 21 01:04:05 CEST 2012


On 12-06-20 4:44 PM, Anthony Damico wrote:
> Hi, I work with data sets with lots of missing values.  We often need
> to conduct logical tests on numeric vectors containing missing values.
> I've searched around for material and conversations on this topic,
> but I'm having a hard time finding anything.  Has anyone written a
> package that deals with this sort of thing?  All I want are a group of
> functions like the ones I've posted below, but I'm worried I'm
> re-inventing the wheel.  If they're not already on CRAN, I feel like
> I should add them.  Any pointers to work already completed on this
> subject would be appreciated.  Thanks!
>
> Anthony Damico
> Kaiser Family Foundation
>
>
>
> Here's a simple example of what I need done on a regular basis:
>
> #two numeric vectors
> a <- c( 1 , NA , 7 , 2 , NA )
>
> b <- c( NA , NA , 9 , 1 , 6 )
>
> #this has lots of NAs
> a > b
>
> #save this result in x
> x <- (a > b)
>
> #overwrite NAs in x with falses (which we do a lot)
> x <- ifelse( is.na( x ) , F , x )
>
> #now x has only trues and falses
> x

Not necessarily.  F is an ordinary variable, not a reserved word; if it 
happens to hold the value TRUE or 17, then x will get that.  Writing 
FALSE avoids the problem.
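
A minimal sketch of that pitfall, using the vectors from the quoted message. The assignment `F <- 17` is contrived for illustration, and the names `y` and `z` are not from the original post:

```r
# The vectors from the original post.
a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)
x <- a > b                      # NA NA FALSE TRUE NA

# F is an ordinary variable, so this assignment is legal.
F <- 17
y <- ifelse(is.na(x), F, x)     # the NAs become 17, not FALSE
rm(F)                           # drop the masking copy; base's F is visible again

# FALSE is a reserved word and cannot be reassigned.
z <- x
z[is.na(z)] <- FALSE            # stays logical: FALSE FALSE FALSE TRUE FALSE
```

Here `y` is no longer a logical vector at all, while `z` contains only TRUE and FALSE regardless of what F holds.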

For your question:  I think what you're doing is a bad idea.  There are 
certain relations that hold for ">" that just don't hold for your 
function, e.g.

(a > b) is the same as !(a <= b)

(a > b) is the same as ( !(a < b) & (a != b) )

if !(a < b) and !(b < c), then !(a < c)

etc.
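
To make the first relation concrete, here is a rough sketch that applies the "NA becomes FALSE" rule to both ">" and "<=" and checks whether they are still complements. The helper name `na_false` is hypothetical, not an existing function:

```r
a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)

# Hypothetical helper: apply a comparison, then turn NA results into FALSE.
na_false <- function(op, a, b) {
  r <- op(a, b)
  r[is.na(r)] <- FALSE
  r
}

gt <- na_false(`>`, a, b)     # FALSE FALSE FALSE  TRUE FALSE
le <- na_false(`<=`, a, b)    # FALSE FALSE  TRUE FALSE FALSE

# For a real order relation these would agree everywhere.
agree <- gt == !le            # FALSE FALSE TRUE TRUE FALSE
```

The two disagree at every position where either input is missing: there both "a > b" and "a <= b" come out FALSE.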

I think you'll find it very difficult to define the other comparison 
operators in a way that doesn't lead to strange behaviour when it 
violates these relations.  Even if you never use any other comparisons, 
your reasoning about results will end up incorrect, because these 
relations are so ingrained into our psyches.

It would probably be easier to get consistency if you treated NA as -Inf 
or +Inf, or just avoided the suggestive name:  define foo(a,b) to return 
TRUE or FALSE according to your desired rules, and don't pretend it's an 
order relation.
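
Both suggestions might look roughly like the following. This is only a sketch: `na_low` and `known_larger` are hypothetical names, and the rule inside `known_larger` is one possible choice picked for illustration.

```r
a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)

# Hypothetical helper: recode missing values as -Inf.
# Note that is.na() is also TRUE for NaN, so NaN is recoded too.
na_low <- function(x) {
  x[is.na(x)] <- -Inf
  x
}

a2 <- na_low(a)
b2 <- na_low(b)

gt <- a2 > b2                 # TRUE FALSE FALSE TRUE FALSE
consistent <- identical(gt, !(a2 <= b2))

# Alternative: a neutrally named predicate (foo in the message above).
# "Known to be larger" is one possible rule, chosen here for illustration.
known_larger <- function(a, b) !is.na(a) & !is.na(b) & a > b
```

With the -Inf recoding the ordinary operators stay mutually consistent, at the price that two missing values compare as equal and a missing value is smaller than every observed one.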

Duncan Murdoch


