[Rd] portableParallelSeeds Package violation, CRAN exception?

Paul Johnson pauljohn32 at gmail.com
Wed Aug 6 20:10:32 CEST 2014


I'm writing to ask for a policy exception, or for advice on how to
make this package acceptable to CRAN.

http://rweb.quant.ku.edu/kran/src/contrib/portableParallelSeeds_0.9.tar.gz

Yesterday I tried to submit a package to CRAN and Dr Ripley pointed
out that I had not understood the instructions about packages.  Here's
the part where the R check gives a NOTE:

* checking R code for possible problems ... NOTE
Found the following assignments to the global environment:
File ‘portableParallelSeeds/R/initPortableStreams.R’:
   assign("currentStream", n, envir = .GlobalEnv)
   assign("currentStates", curStates, envir = .GlobalEnv)
   assign("currentStream", 1L, envir = .GlobalEnv)
   assign("startStates", runSeeds, envir = .GlobalEnv)
   assign("currentStates", runSeeds, envir = .GlobalEnv)
   assign("currentStream", as.integer(currentStream), envir = .GlobalEnv)
   assign("startStates", runSeeds, envir = .GlobalEnv)
   assign("currentStates", runSeeds, envir = .GlobalEnv)

Altering the user's environment requires a special arrangement with
CRAN. I believe this is justified, and I'll sketch the reasons now.
But mostly I'm at your mercy, and if there is any way to make this
possible, I would be very grateful.

To control and replace random number streams, it really is necessary
to alter the workspace. That's where the random generator state is
stored.  This is acknowledged in Robert Gentleman's book, R Programming
for Bioinformatics: "The decision to have these [random generator]
functions manipulate a global variable, .Random.seed, is slightly
unfortunate as it makes it somewhat more difficult to manage several
different random number streams simultaneously" (Gentleman, 2009, p.
201).
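
As a minimal sketch of that point (base R only, not code from the
package), the generator state lives in the variable .Random.seed in
the global environment, so saving and restoring a stream means reading
and assigning that variable. The names savedState, first, and second
are made up for this illustration.

```r
## Illustration only: base R, no packages required.
set.seed(12345)                      # creates .Random.seed in .GlobalEnv
savedState <- get(".Random.seed", envir = .GlobalEnv)

first <- runif(3)                    # drawing advances the global state

## Put the saved state back; the next draws repeat the earlier ones.
assign(".Random.seed", savedState, envir = .GlobalEnv)
second <- runif(3)

identical(first, second)             # TRUE
```

The assign() call here is the same kind of write to .GlobalEnv that
the check flags, which is why I don't see a way around it.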

I have developed an understandable set of wrapper functions that handle this.

Some of you may recall this project. I've asked about it here a couple
of times. We allow separate streams of randoms for different purposes
within a single R run. There is a framework to save thousands of those
sets in a file, so it can be used on a cluster or on a single
workstation.  This is handy because, when 1 run in 10,000 on the
cluster exhibits some weird behavior, we can easily re-initiate that
run interactively and see what's going on.
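
To make the idea concrete, here is a rough sketch of switching among
separate streams and replaying one of them. It is not the
portableParallelSeeds API. It assumes the L'Ecuyer-CMRG generator and
parallel::nextRNGStream() from base R's parallel package, and the
names streams, startStates, current, and useStream are hypothetical.

```r
## Hypothetical sketch, not the package's own functions.
## Note: RNGkind() changes the generator for the whole session.
RNGkind("L'Ecuyer-CMRG")
set.seed(2014)

## Build the states of three streams for one run.
streams <- vector("list", 3)
streams[[1]] <- get(".Random.seed", envir = .GlobalEnv)
for (i in 2:3) streams[[i]] <- parallel::nextRNGStream(streams[[i - 1]])
startStates <- streams               # initial states, kept for replay
current <- 1L

## Save the position of the active stream, then switch to stream n.
useStream <- function(n) {
    streams[[current]] <<- get(".Random.seed", envir = .GlobalEnv)
    assign(".Random.seed", streams[[n]], envir = .GlobalEnv)
    current <<- as.integer(n)
    invisible(n)
}

useStream(1); a1 <- runif(2)
useStream(2); b1 <- runif(2)
useStream(1); a2 <- runif(2)         # stream 1 resumes where it stopped

## Replay stream 1 from its saved starting state.
assign(".Random.seed", startStates[[1]], envir = .GlobalEnv)
replay <- runif(4)
identical(replay, c(a1, a2))         # TRUE
```

Saving a list like startStates for each run is what makes it possible
to bring one problem run back up on a workstation.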

I have a vignette, "pps", that explains this. I dropped a copy of it
here in case you don't want to get the package:

http://pj.freefaculty.org/scraps/pps.pdf

While working on that, I gained a considerably deeper understanding of
random generators and seeds.  That is what this vignette is about:

http://pj.freefaculty.org/scraps/PRNG-basics.pdf


We've been running simulations on our cluster with the
portableParallelSeeds framework for 2 years, and we've never had any
trouble.  We are able to re-start runs and verify random number draws
in separate streams.

PJ
-- 
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu


