[R] generation

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Thu Feb 20 15:50:15 CET 2014


I am baffled why you have gone so far down this road, Ted. Considerable effort has gone into making value N of the RNG sequence unpredictable from value N-1 as long as you don't know the internal state of the RNG. This is true for both the R internal RNG and the platform-dependent /dev/urandom ("similarity" of successive reads from /dev/urandom is only true by a very subtle definition of similarity... most users would not see any relationship between them because of the hidden state information). There is no benefit to dipping into the clock again mid-stream if your subsequent analysis is truly unrelated to the previous analysis, and if it is related then you should not be breaking the sequence.

In any case where you need multiple sets of random y values while keeping the same set of x values, just keep the first set of x values in memory while you continue to generate new sets of y values. Many decades of users have found the existing RNG functional interface sufficient in multiple analysis environments. If you are dissatisfied with the randomness supplied to you by the standard functions then you should be using custom RNGs and/or hardware/OS-specific entropy sources, not mucking around with set.seed() or expecting set.seed() to do something it was not designed to do.
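
A minimal sketch of that approach, reusing the seed and sizes from the original question. The number of sets (3) and the names n_sets and y_sets are illustrative only:

```r
set.seed(100)             # seed once for the whole run
x <- 10 * runif(10)       # generated once and kept in memory

n_sets <- 3               # illustrative number of y sets
y_sets <- vector("list", n_sets)
for (i in seq_len(n_sets)) {
  # each call continues the same RNG stream, so each y differs
  y_sets[[i]] <- rnorm(10)
}

x                         # unchanged by the later draws
y_sets[[1]]
y_sets[[2]]
```

Because nothing re-seeds the generator inside the loop, the y sets differ from one another, and the whole run is still reproducible from the single set.seed(100) at the top.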

To the OP: Often people say that you only need to call set.seed() once per session, but I think it is better to think of it as once per group of reproducible simulations. I often re-run set.seed() several times during the same R session to maintain consistency as I get a set of simulations working, but there is simply no need to call it as a way to introduce more randomness into the RNG sequence within a single set of related randomly-generated data.
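
As a rough sketch of what "once per group" might look like (the seed values 100 and 200 and the variable names are arbitrary choices for illustration):

```r
# Group 1: one seed covers every draw in this simulation
set.seed(100)
x1 <- 10 * runif(10)
y1 <- rnorm(10)

# Re-running the group from its seed reproduces it exactly
set.seed(100)
x1_again <- 10 * runif(10)
y1_again <- rnorm(10)
identical(x1, x1_again) && identical(y1, y1_again)

# Group 2: a separate simulation gets its own seed
set.seed(200)
x2 <- 10 * runif(10)
y2 <- rnorm(10)
```

Within each group the seed is set once, before the first draw, and never between the x and y draws.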
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

On February 20, 2014 4:00:10 AM PST, Ted.Harding at wlandres.net wrote:
>[see at end]
>On 20-Feb-2014 10:47:50 Rui Barradas wrote:
>> Hello,
>> 
>> I'm not sure I understand the question. When you use set.seed, it will
>> have effect in all calls to the random number generator following it. So
>> the value for y is also fixed.
>> As for your code, you don't need the second set.seed. And though it is
>> not syntactically incorrect, the way you are coding it is not very usual.
>> Try instead
>> 
>> set.seed(100)
>> x <- 10*runif(10)
>> x
>> 
>> y <- rnorm(10)
>> y
>> 
>> y <- rnorm(10)  #different y
>> y
>> 
>> 
>> Hope this helps,
>> 
>> Rui Barradas
>> 
>> Em 20-02-2014 07:05, IZHAK shabsogh escreveu:
>>> How do I use set.seed? For example, I want to generate a fixed x with a
>>> different value of y each time, i.e.
>>>
>>> generate x<-rnorm(10)
>>> generate y<-rnorm(10)
>>> I want to have x fixed but y changing at each iteration. This is what I
>>> tried, but it is not working
>>>
>>>
>>> {
>>> set.seed(100)
>>>
>>> x<-10*runif(10)
>>> }
>>> x
>>> set.seed(y<-rnorm(10))
>>> y
>
>It seems clear that Izhak seeks to detach the random generation of y
>from the random generation of x after using set.seed(). On my reading of
>  ?RNG
>once set.seed has been used, as Rui says, it affects all subsequent
>calls to the generator. Initially, however:
>
>  Note:
>  Initially, there is no seed; a new one is created from the current
>  time when one is required.  Hence, different sessions started at
>  (sufficiently) different times will give different simulation
>  results, by default.
>
>But, even so, it still seems (perhaps) that using a RNG without the
>call to set.seed() will still establish a seed (and its consequences)
>for that session (in effect there is an implicit call to set.seed()).
>
>This leads me to suggest that a useful innovation could be to add a
>feature to set.seed() so that
>
>  set.seed(NULL)
>
>(which currently generates an error) would undo the effect of any
>previous (explicit or implicit) call to set.seed() so that, for
>instance,
>
>  set.seed(100)
>  x<-10*runif(10)
>
>  set.seed(NULL)
>  y <- rnorm(10)
>
>would result in y being generated from a seed which was set from the
>system clock. There is no form of argument to set.seed() which instructs
>it to take its value from the system clock (or other source of external
>random events); and indeed it seems that only the system clock is
>available.
>
>On Linux/UNIX systems, at least, there is a possible accessible source
>of external randomness, namely '/dev/random', and its companion
>'/dev/urandom'. This accumulates random noise from high-resolution timings
>of system events like key-presses, mouse-clicks, disk accesses, etc.,
>which take place under external influences and are by nature irregular in
>timings.
>
>The difference between /dev/random and /dev/urandom is that one read
>from /dev/random effectively resets it, and further reads may be blocked
>until sufficient new noise has accumulated; while repeated reads from
>/dev/urandom are always possible (though with short time-lapses there
>may not be much difference between successive reads).
>
>The basic mechanism for this is via the command 'dd', on the lines of
>
>  dd if=/dev/urandom of=newseed count=1 bs=32
>
>which makes one ("count=1") read from input file ("if") /dev/urandom
>of 32 bytes ("bs=32") into the output file ("of") newseed.
>
>When I ran the above command on my machine just now, and inspected the
>results in hex notation ('od -x newseed') I got (on-screen):
>
>od -x newseed
>0000000 4be9 7634 41cf 5e17 b068 7898 879e 8b5f
>0000020 fb4f 52e6 59ef 0b58 5258 4a3a df04 c18d
>0000040
>
>where the initial "0000000" etc. denote byte-counts to the beginning
>of the current line (expressed in octal); so the actual byte content
>of newseed is:
>
>  4b e9 76 34 41 cf 5e 17 b0 68 78 98 87 9e 8b 5f
>  fb 4f 52 e6 59 ef 0b 58 52 58 4a 3a df 04 c1 8d
>
>This could be achieved via a system() call from R; and the contents
>of newseed would then need to be converted into a format suitable
>for use as argument to set.seed().
>
>For the time being (not having time just now) I leave the details
>to others ...
>Ted.
>
>-------------------------------------------------
>E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
>Date: 20-Feb-2014  Time: 12:00:07
>This message was sent by XFMail
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



