[R] Question about "sample" function and inconsistent results I am getting across machines.

Fomby, Tom t|omby @end|ng |rom m@||@@mu@edu
Sun May 3 21:58:50 CEST 2020


Thank you so much Duncan.  I will pitch in.  Tom


________________________________
From: Duncan Murdoch <murdoch.duncan using gmail.com>
Sent: Sunday, May 3, 2020 2:56 PM
To: Fomby, Tom; r-help using R-project.org
Subject: Re: [R] Question about "sample" function and inconsistent results I am getting across machines.

On 03/05/2020 3:43 p.m., Fomby, Tom wrote:
>
> Dear Duncan,
>
> OK, I will certainly ask my students to download the most recent version
> of base R at the start of each semester and, just to be safe, include
> the RNGkind(sample.kind="Rejection") command before the students get
> started on the data partitioning part of their exercise using the sample
> function.

Actually, it would probably be a better idea to say

RNGkind(kind = "default", normal.kind = "default", sample.kind = "default")

in case bugs are found in any of the current algorithms and they change
again.
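
A minimal sketch of how such a preamble might look at the top of a homework
script, assuming R 3.6.0 or later (the first version with the sample.kind
argument). The seed and the sizes 181 and 150 are taken from the example
quoted further down; the name test.index is only illustrative.

```r
# Reset all three RNG settings to the defaults of the running R version,
# overriding anything restored from an old saved workspace.
RNGkind(kind = "default", normal.kind = "default", sample.kind = "default")

set.seed(1)
train.index <- sample(181, 150)                     # 150 training rows out of 181
test.index  <- setdiff(seq_len(181), train.index)   # the remaining 31 rows

head(train.index)
```

Because the settings are reset before set.seed() is called, every student
running the same version of R should get the same partition, whatever was
stored in their workspace.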

>
> By the way, how can one become a member of the R community so as to
> provide support for volunteers like yourself?

The R Foundation accepts donations to become a "Supporting Member"; see
here:  https://www.r-project.org/foundation/donors.html.  They sponsor
various events, so that is one way.  There is probably also a local user
group somewhere near you that would appreciate contributions of some
sort.  There's a list of those here:
https://blog.revolutionanalytics.com/local-r-groups.html, and another
one here:  https://www.meetup.com/pro/r-user-groups/.  (I haven't
checked how similar those two lists are.)

Duncan Murdoch


>
> Thank you,
>
> Tom Fomby
>
> Department of Economics
>
> SMU
>
> Dallas, TX 75275
>
>
>
> ------------------------------------------------------------------------
> *From:* Duncan Murdoch <murdoch.duncan using gmail.com>
> *Sent:* Sunday, May 3, 2020 2:32 PM
> *To:* Fomby, Tom; r-help using R-project.org
> *Subject:* Re: [R] Question about "sample" function and inconsistent
> results I am getting across machines.
> On 03/05/2020 1:39 a.m., Fomby, Tom wrote:
>> Please consider the following code:
>>
>> set.seed(1)
>>
>> train.index = sample(181,150)
>> head(train.index)
>> # [1]  49  67 103 162  36 159  Result from my ASUS computer
>> #
>> # [1]  68 167 129 162 43 14  Result from my wife's HP Pavilion computer
>>
>> In both cases, version 3.6.3 of R is being used.
>>
>> In addition, of the 20 students in my Predictive Analytics class, 14 got
>> the first result while 6 got the latter result.  These results do not seem
>> to be specific to Mac (macOS) versus PC (Windows).  In several cases,
>> students using 3.6.3 got differing results.  This makes grading homework
>> challenging, since I do not know which partition of the data each student
>> is using.
>>
>> Thank you for considering my question.
>
> Likely some of you are storing and restoring workspaces, and have been
> doing so for a long time.  If you type
>
> RNGkind()
>
> what you should see is
>
> [1] "Mersenne-Twister" "Inversion"        "Rejection"
>
> but if the .Random.seed is restored from an old session, you might see
>
> [1] "Mersenne-Twister" "Inversion"        "Rounding"
>
> The latter uses the buggy version of sample().  Those users should run
>
> RNGkind(sample.kind = "Rejection")
>
> to start using the corrected sampling algorithm.  (The default was
> changed in R 3.6.0, but if you saved your seed from a previous version,
> you'd get the old sampler).
>
> They should also stop reloading old workspaces, but that's another
> discussion.
>
> Duncan Murdoch
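
The check described above can be sketched as follows, again assuming R 3.6.0
or later. The second half deliberately switches to the old "Rounding" sampler
(R warns when it is requested, hence suppressWarnings) to show that the two
samplers give different partitions from the same seed. The names old.index
and new.index are only illustrative.

```r
# Report which sampler the current session is using, and fix it if needed.
kinds <- RNGkind()
if (kinds[3] != "Rejection") {
  message("Old sampler in use (", kinds[3], "); switching to Rejection.")
  RNGkind(sample.kind = "Rejection")
}

# Same seed, old sampler.
suppressWarnings(RNGkind(sample.kind = "Rounding"))
set.seed(1)
old.index <- sample(181, 150)

# Same seed, corrected sampler.
RNGkind(sample.kind = "Rejection")
set.seed(1)
new.index <- sample(181, 150)

identical(old.index, new.index)   # FALSE: the partitions differ
```

The first result reported in the original question (starting 49 67 103)
appears consistent with the "Rounding" sampler and the second (starting
68 167 129) with "Rejection"; that mapping is an inference and is not stated
in the thread.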

