[Rd] R 'base' returning 0 as sum of NAs

Duncan Murdoch murdoch.duncan at gmail.com
Wed Jan 11 12:50:09 CET 2017


On 11/01/2017 5:33 AM, Alex Ivan Howard wrote:
> Dear R Team
>
> The following line returns 0 (zero) as answer:
> sum(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)
>
> One would, however, have expected it to return 'NaN', as is the case with
> function 'mean':
>
>> mean(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)
> [1] NaN
>

With na.rm = TRUE every NA is dropped before the computation, so the two
expressions are long versions of

sum(numeric())
mean(numeric())

It is reasonable that an empty sum is zero.  The mean is 0/0, so NaN is 
reasonable.
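
A short sketch of that reasoning, for anyone who wants to check it at the
console (the variable names are only illustrative). It shows that dropping
the NAs leaves a zero-length vector, and that returning 0 for the empty sum
keeps sums consistent when a vector is split into pieces:

```r
x <- c(NA_real_, NA_real_, NA_real_, NA_real_)

# With na.rm = TRUE the NAs are dropped first, leaving a zero-length vector.
x_clean <- x[!is.na(x)]
length(x_clean)

# The empty sum is the additive identity, so both calls give 0.
sum(x_clean)
sum(x, na.rm = TRUE)

# The mean divides that sum by the length, i.e. 0/0, which is NaN.
mean(x, na.rm = TRUE)

# Returning 0 keeps sum(c(a, b)) equal to sum(a) + sum(b) when b is empty.
a <- c(1, 2, 3)
b <- numeric()
identical(sum(c(a, b)), sum(a) + sum(b))
```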

If this doesn't suit your needs, then you should put in special checks 
for empty datasets.
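
One possible shape for such a check, as a minimal sketch only: sum_or_na is
a hypothetical helper, not a base R function, and it assumes you want NA
whenever no non-missing values remain (including for zero-length input).

```r
# Hypothetical helper (not part of base R): like sum(x, na.rm = TRUE),
# but returns NA instead of 0 when there is nothing left to add up.
sum_or_na <- function(x) {
  # all(is.na(x)) is TRUE for an all-NA vector and for a zero-length one.
  if (all(is.na(x))) {
    return(NA_real_)
  }
  sum(x, na.rm = TRUE)
}

sum_or_na(c(NA_real_, NA_real_))  # nothing to sum: NA
sum_or_na(c(1, NA, 2))            # ordinary case: NAs dropped, rest summed
```

Note that is.na() is also TRUE for NaN, so a vector holding only NaN values
is treated as empty by this sketch.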

Duncan Murdoch

> The problem in other words:
> I have a vector filled with missing numbers. I run the 'sum' function on
> it, but instruct it to remove all missing values first. Consequently, the
> sum function is left with an empty numeric vector. There is nothing to sum
> over, so it shouldn't actually be able to return a concrete numeric value?
> Shouldn't it thus rather return either NA ('unknown'/'missing') or - in the
> fashion of the mean function - NaN ('not a number')?
>
> With the current state of affairs, the sum function poses the grave danger
> of introducing zeros to one's data (and subsequently other values as well,
> as soon as the zeros get taken up in further calculations).
>
> I hope my e-mail finds you well and I wish the R team all of the best for
> 2017 :)
>
> Kind regards
>
> Alex I. Howard
>
> Web: www.nova.org.za
> Phone: +27 (0) 44 695 0749
> VoiP: +27 (0) 87 751 3490
> Fax: +27 (0) 86 538 7958
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


