[R] 2 small problems: integer division and the nature of NA

Gabor Grothendieck ggrothendieck at myway.com
Fri Feb 4 20:48:44 CET 2005


Denis Chabot <chabotd <at> globetrotter.net> writes:
: The sum of a vector having at least one NA but also valid data gives NA 
: if we do not specify na.rm=T. But with na.rm=T, we are telling sum to 
: give the sum of valid data, ignoring NAs that do not tell us anything 
: about the value of a variable. I found out while getting the sum of 
: small subsets of my data (such as when subsetting by several 
: variables), sometimes a "cell" only contained NAs for my response 
: variable. I would have expected the sum to be NA in such cases, as I do 
: not have a single data point telling me the value of my response here. 
: But R tells me the sum was zero in that cell! Was this behavior 
: considered "desirable" when sum was built? If not, any hope it will be 
: fixed?

Think of it this way: if u and v are index vectors then it's desirable that

	sum(x[u]) + sum(x[v]) == sum(x[c(u,v)])

hold for zero-length index vectors too, in which case
sum(numeric()) should be zero, not NA.
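
To make that concrete, here is a small check of the identity when one of
the index vectors is empty. The data in x, u and v are made up purely for
illustration:

```r
x <- c(3, 1, 4, 1, 5)    # illustrative data, not from the original post
u <- c(1, 2)             # ordinary index vector
v <- integer(0)          # zero-length index vector

lhs <- sum(x[u]) + sum(x[v])   # sum(x[v]) is the sum of an empty vector
rhs <- sum(x[c(u, v)])         # c(u, v) is just u here

identical(lhs, rhs)      # TRUE only because sum(numeric()) is 0
sum(numeric())
```

If the empty sum were NA instead, lhs would be NA while rhs would still be
the sum over u, and the identity would fail.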

If you want a short expression that gives NA for zero-length x, try this:

        sum(x) + if (length(x)) 0 else NA

or define your own function, sum0, say.
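
A minimal sketch of what such a function might look like. The body below is
only one possible reading: the na.rm argument and the choice to drop NAs
before testing the length are my assumptions, made so that a cell holding
nothing but NAs also comes back as NA:

```r
# sum0: like sum(), but returns NA when there is nothing left to add up.
# The na.rm handling is an assumption, not part of base R's sum().
sum0 <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]            # drop NAs first, if asked to
  if (length(x) == 0) NA_real_ else sum(x)
}

sum0(c(1, NA, 2), na.rm = TRUE)   # some valid data: ordinary sum
sum0(c(NA, NA), na.rm = TRUE)     # only NAs: NA rather than 0
sum0(numeric())                   # empty input: NA rather than 0
```

Note that sum0 deliberately gives up the additivity property above, so it
is better kept as a separate function than as a replacement for sum.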



