[R] sum() with na.rm=TRUE, again

Thu Apr 25 17:25:32 CEST 2002

Hi:

I remember a post several days ago by Jon Baron concerning the
behavior of sum() when one sets na.rm=TRUE:
the result is zero for a vector consisting entirely of NAs, as here for the
second row:

> ss<- data.frame(x=c(1,NA,3,4),y=c(2,NA,4,NA))
> ss
   x  y
1  1  2
2 NA NA
3  3  4
4  4 NA

> apply(ss,1,sum,na.rm=TRUE)
1 2 3 4 
3 0 7 4 
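
The same zero can be reproduced without apply(). The lines below are only a small check in base R, with row2 as a made-up name standing in for the second row of ss: once na.rm=TRUE has dropped every NA, sum() is left with an empty vector, and the sum of an empty vector is 0.

```r
# row2 is a hypothetical stand-in for the all-NA second row of ss
row2 <- c(NA_real_, NA_real_)

sum(row2)                  # NA: missing values propagate by default
sum(row2, na.rm = TRUE)    # 0: both NAs are dropped before summing
sum(row2[!is.na(row2)])    # 0: the same thing, done by hand
sum(numeric(0))            # 0: sum of an empty vector
```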

I am rather alarmed by that zero, because I was just about to place the sum
function into an apply() on a rather large data management project, where
about 5% of my matrix rows have two missing values.  Is there a "safe" way
to use sum(), so that such zeroes are not created?  A safe.sum() that takes
arguments just as general as sum()?  I mean, I think I could get around this
little problem like this,

> apply(ss,1,function(x){
+   ifelse(all(is.na(x)),NA,sum(!is.na(x))*mean(x,na.rm=TRUE))})
 1  2  3  4 
 3 NA  7  4 

but is there a safer way to write a sum() function?  Or, do these zeroes
serve some purpose that I am missing?
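
One possible shape for such a safe.sum() is sketched below. It is not an existing R function: the name and the choice to return NA for all-missing input are assumptions taken from the question above, and it assumes its arguments are atomic vectors that c() can combine. It takes the same ... and na.rm arguments as sum() and only differs when na.rm=TRUE and every value is NA.

```r
# Hypothetical helper, not part of base R: behaves like sum(), except that
# with na.rm = TRUE it returns NA when every supplied value is missing.
# Assumes the arguments are atomic vectors that c() can combine.
safe.sum <- function(..., na.rm = FALSE) {
  x <- c(...)
  # An empty input still sums to 0, as with sum(); only all-NA input is special.
  if (na.rm && length(x) > 0 && all(is.na(x))) {
    return(NA_real_)
  }
  sum(x, na.rm = na.rm)
}

ss <- data.frame(x = c(1, NA, 3, 4), y = c(2, NA, 4, NA))
res <- apply(ss, 1, safe.sum, na.rm = TRUE)
res   # values: 3 NA 7 4
```

Unlike the mean()-based workaround, this adds the values directly, so no count-times-mean multiplication is involved.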
Thanks in advance...

Tom