# [R] sum() with na.rm=TRUE, again

Richards, Tom richards at upci.pitt.edu
Thu Apr 25 17:25:32 CEST 2002

```Hi:

I remember a post several days ago by Jon Baron, concerning the
behavior of sum() when one sets na.rm=TRUE:
the result will be a zero sum for a vector of all NA's, as here, for the
second row:

> ss<- data.frame(x=c(1,NA,3,4),y=c(2,NA,4,NA))
> ss
x  y
1  1  2
2 NA NA
3  3  4
4  4 NA

> apply(ss,1,sum,na.rm=TRUE)
1 2 3 4
3 0 7 4

I am rather alarmed by that zero, because I was just about to place the sum
function into am apply() on a rather large data management project, where
about 5% of my matrix rows have two missing values.  Is there a "safe" way
to use sum(), so that such zeroes are not created?  A safe.sum() that takes
arguments just as general as sum()?  I mean, I think I could get around this
little problem like this,

apply(ss,1,function(x){ifelse(all(is.na(x)),NA,sum(!is.na(x))*mean(x,na.rm=T
RUE))})
1  2  3  4
3 NA  7  4

but is there a safer way to write a sum() function?  Or, do these zeroes
serve some purpose that I am missing?