# [R] Sum of columns of a data frame equal to NA when all the elements are NA

I can see that one might regard having

sum( sum( 1 ), sum( NULL ) ) == sum( 1 )

be TRUE as a necessary consistency, but going down that road one might expect Bert's

v+NULL == v

for all numeric vectors also. I have always avoided that construction as poor computing practice, but if NULL is supposed to represent the empty set mathematically [1] then this would seem to follow.

[1] https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf
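
To make the contrast concrete, here is a small sketch of the two behaviors discussed in this thread, using base R only. The vector `v` is just an illustrative value and not something from the original posts.

```r
# The empty sum acts as the additive identity for sum()
sum(NULL)                          # zero
sum(numeric(0))                    # zero
sum(sum(1), sum(NULL)) == sum(1)   # TRUE

# sum() concatenates its arguments first, so NULL simply contributes nothing
sum(1, NULL)                       # 1, as in Bert's example

# The binary `+` operator behaves differently: a zero-length operand
# gives a zero-length result, so v + NULL is not v
v <- c(1, 2, 3)                    # illustrative vector (assumption)
length(v + NULL)                   # 0
```

So the identity holds for sum() but not for `+`, which is the asymmetry Bert points out below.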

On March 21, 2018 1:06:46 PM PDT, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>"I see: consistency with additive identity. "
>
>Ummm, well:
>
>> 1+NULL
>numeric(0)
>
>> sum(1,NULL)
>[1] 1
>
>Of course, there could well be something here I don't get, but that
>doesn't
>look very consistent to me. However, as I said privately, so long as
>the
>corner case behavior is documented, which it is, I don't care.
>
>> I see: consistency with additive identity. That makes sense. Thanks.
>>
>> > No. The empty sum is zero. Adding it to another sum should not
>change
>> it. Nothing audacious about that. This is consistent; other
>definitions
>> just cause trouble.
>> >>
>> >> Surely the result of summation of non-existent values is not
>defined,
>> is it not? And since the NA values have been _removed_, there's
>nothing
>> left to sum over. In fact, pretending that the result in that case is
>zero
>> would appear audacious, no?
>> >>>
>> >>> What do you mean by "should not"?
>> >>>
>> >>> NULL means "missing object" in R. The result of the sum function
>is
>> always expected to be numeric... so NA_real_ or NA_integer_ could make
>sense
>> as possible return values. But you cannot compute on NULL so no, that
>> doesn't work.
>> >>>
>> >>> See the note under the "Value" section of ?sum as to why zero is
>> returned when all inputs are removed.
>> >>>> Should not the result be NULL if you have removed the NA with na.rm=TRUE?
>> >>>>> Dear list users,
>> >>>>> let me ask you this trivial question. I worked on that for a
>long
>> >>>> time, by now.
>> >>>>> Suppose I have a data frame with NAs and want to sum some columns
>with
>> >>>> rowSums:
>> >>>>>
>> >>>>> df <- data.frame(A = runif(10), B = runif(10), C = rnorm(10))
>> >>>>> df[1, ] <- NA
>> >>>>> rowSums(df[ , which(names(df) %in% c("A","B"))], na.rm=TRUE)
>> >>>>>
>> >>>>> If all the elements of the selected columns are NA, rowSums
>returns 0
>> >>>> while I need NA.
>> >>>>> Is there an easy and efficient way to use rowSums within a
>function
>> >>>> like
>> >>>>>
>> >>>>> function(x) ifelse(all(is.na(x)), as.numeric(NA), rowSums...)?
>> >>>>>
>> >>>>> or an equivalent function?
>> >>>>>
>> >>>>> Thank you for your help
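
One possible way to get the behavior asked for in the original question is sketched below. It is a minimal example and not a solution proposed in the thread: the helper name `row_sums_na`, the seed, and the extra NA placed in row 2 are all assumptions added for illustration. The idea is to call rowSums with na.rm=TRUE as before, then overwrite the result with NA for rows that have no non-missing values.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(A = runif(10), B = runif(10), C = rnorm(10))
df[1, ] <- NA        # row 1: every selected value is missing
df[2, "A"] <- NA     # row 2: only A is missing (added for illustration)

# Hypothetical helper: like rowSums(x, na.rm = TRUE), but returns NA
# for rows in which all values are NA
row_sums_na <- function(x) {
  x <- as.matrix(x)
  s <- rowSums(x, na.rm = TRUE)
  # rowSums(!is.na(x)) counts the non-missing values in each row
  s[rowSums(!is.na(x)) == 0] <- NA_real_
  s
}

res <- row_sums_na(df[, c("A", "B")])
is.na(res[1])   # TRUE: all selected values in row 1 are NA
res[2]          # equals df$B[2], since only A is missing in row 2
```

This stays vectorized, so it avoids calling ifelse() or apply() row by row.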
