[R] Difference between 32-bit and 64-bit version

Thierry Onkelinx thierry.onkelinx at inbo.be
Thu Jun 4 12:11:43 CEST 2015


"low probability of occurring" was just statistician's lingo for "rare" ;-)


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-06-04 11:53 GMT+02:00 Duncan Murdoch <murdoch.duncan at gmail.com>:

> On 04/06/2015 3:59 AM, Thierry Onkelinx wrote:
> > Dear Duncan,
> >
> > I had been thinking about FAQ 7.31. I tried to create a dummy dataset
> > with the same structure to replicate the problem without the need of
> > sending my dataset. However, all of them gave identical() results between
> > 32-bit and 64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it
> > correct to infer that tiny differences between 32-bit and 64-bit are
> > possible but have a low probability of occurring?
>
> Differences are rare, but it's hard to assign a probability to them.
>
> Duncan Murdoch
>
> >
> > signif() does indeed make more sense than round(). Using 20 digits gives
> > identical results, 21 digits gives non-identical results.
> >
> > Best regards,
> >
> > ir. Thierry Onkelinx
> >
> > 2015-06-03 18:09 GMT+02:00 Duncan Murdoch <murdoch.duncan at gmail.com>:
> >
> >     On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
> >     > Dear all,
> >     >
> >     > I'm a bit puzzled by the difference in an object when created in R
> >     > 32-bit and R 64-bit.
> >     >
> >     > Consider the code below. test.rda is available at
> >     > https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
> >     >
> >     > # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
> >     > library(lme4)
> >     > load("test.rda")
> >     > coef.32 <- coef(test)
> >     > save(coef.32, file = "32bit.rda")
> >     >
> >     > # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
> >     > library(lme4)
> >     > load("~/test.rda")
> >     > coef.64 <- coef(test)
> >     > save(coef.64, file = "64bit.rda")
> >     >
> >     >
> >     > # Compare the results
> >     > # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
> >     > # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
> >     > library(lme4)
> >     > load("32bit.rda")
> >     > load("64bit.rda")
> >     > identical(coef.32, coef.64) # FALSE
> >     > identical(coef.32$fRow, coef.64$fRow) # FALSE
> >     > identical(coef.32$fLocation, coef.64$fLocation) # TRUE
> >     > identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
> >     >
> >     > The first comparison is FALSE, because the second is FALSE. But why
> >     > is the second FALSE and the third and fourth TRUE?
> >     >
> >     > My goal is to calculate a SHA1 hash on coef(test) to track whether
> >     > the coefficients of test have changed. I'd like to get the same hash
> >     > on a 32-bit and a 64-bit system. A simple hack would be to calculate
> >     > the hash on round(coef(test), 20). Is that a good or bad idea?
> >     >
> >     > identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE
> >
> >     Different math libraries round differently, so small differences are
> >     expected.  This is FAQ 7.31.  In many cases the 32 bit calculations
> >     are more accurate, because they tend to use more 80 bit extended
> >     precision intermediate values, but that is not guaranteed.
> >
> >     Rounding before comparing makes sense, but I would use signif()
> >     instead of round(), I would choose a relatively small number of
> >     significant digits, and I would expect to see a few false positives:
> >     if the true value is 0 but some "random" noise is added, I'd expect
> >     values rounded by signif() to be unequal.
> >
> >     Duncan Murdoch
> >
> >     >
> >     > Best regards,
> >     >
> >     > ir. Thierry Onkelinx
> >
> >
>
>
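
The advice quoted above (apply signif() with a modest number of significant digits before comparing or hashing) can be sketched roughly as follows. This is only an illustration under stated assumptions: coef_a and coef_b are made-up stand-ins for coef.32$fRow and coef.64$fRow, round_df() is a hypothetical helper, and 8 digits is an arbitrary choice rather than a value recommended in the thread. The optional hashing step assumes the digest package is installed.

```r
# Made-up stand-ins for coef.32$fRow and coef.64$fRow: same values, except
# that one entry differs in the last bit (0.1 + 0.2 is not exactly 0.3).
coef_a <- data.frame(intercept = c(0.3, 1.25, -2.5), slope = c(1/3, 2/3, 1))
coef_b <- coef_a
coef_b$intercept[1] <- 0.1 + 0.2

identical(coef_a, coef_b)  # FALSE

# Hypothetical helper: apply signif() to every column of a numeric data.frame.
round_df <- function(x, digits = 8) {
  x[] <- lapply(x, signif, digits = digits)
  x
}

identical(round_df(coef_a), round_df(coef_b))  # TRUE

# The caveat about zeros: signif() keeps tiny noise, round() removes it.
identical(signif(0, 8), signif(1e-17, 8))  # FALSE
identical(round(0, 8), round(1e-17, 8))    # TRUE

# Optional: hash the rounded object (assumes the digest package is installed).
if (requireNamespace("digest", quietly = TRUE)) {
  hash_a <- digest::digest(round_df(coef_a), algo = "sha1")
  hash_b <- digest::digest(round_df(coef_b), algo = "sha1")
  print(identical(hash_a, hash_b))
}
```

A double holds roughly 15 to 17 significant decimal digits, which is one reason to pick a digits value well below that when the hash should be insensitive to platform-level rounding noise.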
