[R] differing behavior of mean(), median() and sd() with na.rm

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Wed Aug 22 16:47:44 CEST 2018


Actually, the dissonance is a bit more basic.

Once xxx(..., na.rm=TRUE) has dropped the NA's from an all-NA input, what is
left is a zero-length vector, i.e. numeric(0). So what you see is actually:

> z <- numeric(0)
> mean(z)
[1] NaN
> median(z)
[1] NA
> sd(z)
[1] NA
> sum(z)
[1] 0
etc.
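
To make the connection to the original question explicit, here is a small
sketch that drops the NA's by hand and compares with the na.rm=TRUE calls.
I use NA_real_ (rather than plain NA, which is logical) only so that the
leftover vector is exactly numeric(0); that choice is mine, not from the
original post.

```r
# An all-NA numeric vector
x <- c(NA_real_, NA_real_, NA_real_)

# Dropping the NA's by hand leaves a zero-length numeric vector
z <- x[!is.na(x)]
identical(z, numeric(0))    # TRUE

# The na.rm=TRUE results match the results on the empty vector
mean(x, na.rm = TRUE)       # NaN, same as mean(z)
median(x, na.rm = TRUE)     # NA,  same as median(z)
sd(x, na.rm = TRUE)         # NA,  same as sd(z)
sum(x, na.rm = TRUE)        # 0,   same as sum(z)

# is.na() is TRUE for both NA and NaN; is.nan() tells them apart
is.na(mean(z))              # TRUE
is.nan(mean(z))             # TRUE
is.nan(median(z))           # FALSE
```

Note that code which tests results with is.na() will treat all three of
mean(), median() and sd() alike here; only is.nan() sees the difference.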

I imagine that there may be more of these little inconsistencies due to the
organic way R evolved over time. What the conventions should be can be
purely a matter of personal opinion in the absence of accepted standards.
But I would first look to see what accepted standards there are, if any.
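
If you want one convention in your own code in the meantime, a thin wrapper
is enough. The following is only a sketch: summarise_or_na() is a
hypothetical helper of my own naming, not anything in base R, and returning
NA_real_ for empty input is an assumed convention, not an accepted standard.

```r
# Hypothetical helper (not part of base R): drop NA's, apply the summary
# function f, but return NA_real_ when no non-missing values remain.
summarise_or_na <- function(x, f) {
  z <- x[!is.na(x)]
  if (length(z) == 0L) return(NA_real_)
  f(z)
}

x <- c(NA_real_, NA_real_, NA_real_)
summarise_or_na(x, mean)            # NA instead of NaN
summarise_or_na(x, sum)             # NA instead of 0
summarise_or_na(c(1, NA, 3), mean)  # 2, the usual result when data remain
```

Whether sum() of nothing should be NA rather than 0 is exactly the sort of
thing that is a matter of opinion, so adjust to taste.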

-- Bert


On Wed, Aug 22, 2018 at 7:34 AM Ivan Calandra <calandra using rgzm.de> wrote:

> Dear useRs,
>
> I have just noticed that when input is only NA with na.rm=TRUE, mean()
> results in NaN, whereas median() and sd() produce NA. Shouldn't it all
> be the same? I think NA makes more sense than NaN in that case.
>
> x <- c(NA, NA, NA)
> mean(x, na.rm=TRUE)
> [1] NaN
> median(x, na.rm=TRUE)
> [1] NA
> sd(x, na.rm=TRUE)
> [1] NA
>
> Thanks for any feedback.
>
> Best,
> Ivan
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
