[R] Difference between 32-bit and 64-bit version

Thu Jun 4 09:59:01 CEST 2015

Dear Duncan,

I had been thinking about FAQ 7.31. I tried to create a dummy dataset with
the same structure to replicate the problem without the need to send my
dataset. However, all of them gave identical() results between 32-bit and
64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it correct to
infer that tiny differences between 32-bit and 64-bit are possible but have
a low probability of occurring?
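
For what it is worth, here is a minimal sketch of how such differences could
be quantified rather than only tested with identical(). The two data frames
are made-up stand-ins for coef.32$fRow and coef.64$fRow, and the single
perturbed value is an assumption for illustration, not something taken from
test.rda.

```r
# Hypothetical stand-ins for coef.32$fRow and coef.64$fRow
set.seed(1)
fRow.32 <- data.frame(a = rnorm(5), b = rnorm(5))
fRow.64 <- fRow.32

# Nudge one value by a relative amount of machine epsilon,
# mimicking a last-bit difference between two builds
fRow.64$a[3] <- fRow.64$a[3] * (1 + .Machine$double.eps)

identical(fRow.32, fRow.64)          # FALSE: exact, bit-for-bit comparison
isTRUE(all.equal(fRow.32, fRow.64))  # TRUE: comparison within a tolerance

# Largest absolute difference per column
max.diff <- sapply(
  names(fRow.32),
  function(n) max(abs(fRow.32[[n]] - fRow.64[[n]]))
)
max.diff
```

A non-zero entry in max.diff shows which columns differ and by how much,
which identical() alone does not reveal.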

signif() does indeed make more sense than round(). Using 20 digits gives
identical results; 21 digits gives non-identical results.
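
A rough sketch of the idea, under some assumptions: the helper name
normalise() and the choice of 10 significant digits are mine, and the data
are made up. A double carries only about 15 to 17 significant decimal
digits, so a smaller number of digits than 20 is probably the safer choice.
The digest package is not part of this thread; it is just one way to get a
SHA1 hash and is only used here if it happens to be installed.

```r
# Hypothetical helper: round every numeric column to `digits` significant digits
normalise <- function(df, digits = 10) {
  df[] <- lapply(df, function(col) {
    if (is.numeric(col)) signif(col, digits) else col
  })
  df
}

# Made-up values that differ only by floating-point noise (cf. FAQ 7.31)
x.32 <- data.frame(est = c(sqrt(2)^2, 1/3))
x.64 <- data.frame(est = c(2, 1/3))

identical(x.32, x.64)                        # FALSE
identical(normalise(x.32), normalise(x.64))  # TRUE

# Caveat raised by Duncan: noise around a true zero survives signif()
identical(signif(1e-17, 10), signif(0, 10))  # FALSE

# SHA1 of the normalised object, only if the digest package is available
if (requireNamespace("digest", quietly = TRUE)) {
  hash.32 <- digest::digest(normalise(x.32), algo = "sha1")
  hash.64 <- digest::digest(normalise(x.64), algo = "sha1")
  identical(hash.32, hash.64)
}
```

Values expected to be exactly zero would need separate handling, for example
by setting anything below a chosen absolute threshold to 0 before hashing.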

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-06-03 18:09 GMT+02:00 Duncan Murdoch <murdoch.duncan at gmail.com>:

> On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
> > Dear all,
> >
> > I'm a bit puzzled by the difference in an object when created in R 32-bit
> > and R 64-bit.
> >
> > Consider the code below. test.rda is available at
> > https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
> >
> > # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
> > library(lme4)
> > load("test.rda")
> > coef.32 <- coef(test)
> > save(coef.32, file = "32bit.rda")
> >
> > # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
> > library(lme4)
> > load("~/test.rda")
> > coef.64 <- coef(test)
> > save(coef.64, file = "64bit.rda")
> >
> >
> > # Compare the results
> > # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
> > # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
> > library(lme4)
> > load("32bit.rda")
> > load("64bit.rda")
> > identical(coef.32, coef.64) # FALSE
> > identical(coef.32$fRow, coef.64$fRow) # FALSE
> > identical(coef.32$fLocation, coef.64$fLocation) # TRUE
> > identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
> >
> > The first comparison is FALSE, because the second is FALSE. But why is the
> > second FALSE and the third and fourth TRUE?
> >
> > My goal is to calculate a SHA1 hash on the coef(test) to track if the
> > coefficients of test have changed. I'd like to get the same hash on a
> > 32-bit and 64-bit system. A simple hack would be to calculate the hash on
> > round(coef(test), 20). Is that a good or bad idea?
> >
> > identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE
>
> Different math libraries round differently, so small differences are
> expected.  This is FAQ 7.31.  In many cases the 32 bit calculations are
> more accurate, because they tend to use more 80 bit extended precision
> intermediate values, but that is not guaranteed.
>
> Rounding before comparing makes sense, but I would use signif() instead
> of round(), I would choose a relatively small number of significant
> digits, and I would expect to see a few false positives:  if the true
> value is 0 but some "random" noise is added, I'd expect values rounded
> by signif() to be unequal.
>
> Duncan Murdoch