[Rd] simulate in stats

Thu Sep 15 22:18:28 CEST 2005

I agree: no function should per default touch the random number  
stream. Otherwise this will undoubtedly lead to misuse. And while one  
may want to include a seed argument in case a user wants to set it  
explicitly, I would argue that the preferred usage is to do
   set.seed(SOMETHING)
   someFunction()
and then educate users that this is the way to go.

Kasper

On Sep 15, 2005, at 9:07 AM, Paul Gilbert wrote:

> BTW, I think there is a problem with the way the argument "seed" is  
> used
> in the new simulate in stats.  The problem is that repeated calls to
> simulate using the default argument will introduce a new pattern into
> the RNG:
>
>
>> stats:::simulate
>>
> function (object, nsim = 1, seed = as.integer(runif(1, 0,
> .Machine$integer.max)),   ...)
> UseMethod("simulate")
> <environment: namespace:stats>
>
>
>
>> stats:::simulate.lm
>>
> function (object, nsim = 1, seed = as.integer(runif(1, 0,
> .Machine$integer.max)),    ...)
> {
>     if (!exists(".Random.seed", envir = .GlobalEnv))
>         runif(1)
>     RNGstate <- .Random.seed
>     set.seed(seed)
>   ...
>
> This should not be done, as the resulting RNG has not been studied or
> proven. A better mechanism is  to have a default argument equal NULL,
> and not touch the seed in that case. There are several examples of  
> this
> in the package dse1 (in bundle dse),  see for example simulate.ARMA  
> and
> simulate.SS. They also use the utilities in the setRNG package to save
> more of the information necessary to reproduce simulations. Roughly it
> is done like this
> simulate.x <- function (model, rng = NULL,  ...)
>   {if (is.null(rng)) rng <- setRNG() #returns the RNG setting to be
> saved with the result
>     else {
>         old.rng <- setRNG(rng)
>         on.exit(setRNG(old.rng))
>         }
>    ...
>
>
> The seed by itself is not very useful if the purpose is to be able to
> reproduce things, and I think it would be a good idea to  
> incorporate the
> few small functions setRNG into stats (especially if the simulate
> mechanism is being introduced).
>
> The argument "nsim" presumably alleviates to some extent the above
> concern about changing the RNG pattern. However, in my fairly  
> extensive
> experience it is not very workable to produce all the simulations and
> then do the analysis of them. In a Monte Carlo experiment the  
> generated
> data set is just too big. A better approach is to do the analysis and
> save only necessary information after each simulation. That is the
> approach, for example, in dse2:::EstEval.
>
> Paul
>
> Paul Gilbert wrote:
>
>
>> Can the arguments nsim and seed be passed as part of ... in the new
>> simulate generic in R-2.2.0alpha package stats?
>>
>> This would potentially allow me to use the stats generic rather than
>> the one I define in dse. There are contexts where nsim and seed do  
>> not
>> make sense. I realize that the default arguments could be ignored,  
>> but
>> it does not really make sense to introduce a new generic with that in
>> mind. (I would also prefer that the "object" argument was called
>> "model" but this is less important.)
>>
>> Paul Gilbert
>>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>