[Rd] Precision of function mean,bug?
henr|k@bengt@@on @end|ng |rom gm@||@com
Wed May 20 21:18:16 CEST 2020
On Wed, May 20, 2020 at 11:10 AM brodie gaslam via R-devel
<r-devel using r-project.org> wrote:
> > On Wednesday, May 20, 2020, 7:00:09 AM EDT, peter dalgaard <pdalgd using gmail.com> wrote:
> > Expected, see FAQ 7.31.
> > You just can't trust == on FP operations. Notice also
> Additionally, since you're implementing a "mean" function you are testing
> against R's mean, you might want to consider that R uses a two-pass
> calculation to reduce floating point precision error.
This one is important.
FWIW, matrixStats::mean2() provides argument refine=TRUE/FALSE to
calculate mean with and without this two-pass calculation;
> a <- c(x[idx],y[idx],z[idx]) / 3
> b <- mean(c(x[idx],y[idx],z[idx]))
> b == a
> b - a
> c <- matrixStats::mean2(c(x[idx],y[idx],z[idx])) ## default to refine=TRUE
> b == c
> b - c
> d <- matrixStats::mean2(c(x[idx],y[idx],z[idx]), refine=FALSE)
> a == d
> a - d
> c == d
> c - d
Not surprisingly, the two-pass higher-precision version (refine=TRUE)
takes roughly twice as long as the one-pass quick version
>  https://github.com/wch/r-source/blob/tags/R-4-0-0/src/main/summary.c#L482
> > > a2=(z[idx]+x[idx]+y[idx])/3
> > > a2==a
> >  FALSE
> > > a2==b
> >  TRUE
> > -pd
> > > On 20 May 2020, at 12:40 , Morgan Morgan <morgan.emailbox using gmail.com> wrote:
> > >
> > > Hello R-dev,
> > >
> > > Yesterday, while I was testing the newly implemented function pmean in
> > > package kit, I noticed a mismatch in the output of the below R expressions.
> > >
> > > set.seed(123)
> > > n=1e3L
> > > idx=5
> > > x=rnorm(n)
> > > y=rnorm(n)
> > > z=rnorm(n)
> > > a=(x[idx]+y[idx]+z[idx])/3
> > > b=mean(c(x[idx],y[idx],z[idx]))
> > > a==b
> > > #  FALSE
> > >
> > > For idx= 1, 2, 3, 4 the last line is equal to TRUE. For 5, 6 and many
> > > others the difference is small but still.
> > > Is that expected or is it a bug?
> R-devel using r-project.org mailing list
More information about the R-devel