[Rd] Speed-up/Cache loadNamespace()

Sun Jul 19 22:07:42 CEST 2020

Mario,

On Unix, if you use Rserve you can pre-load all packages in the server (via the eval config directive or by running Rserve::run.Rserve() from a session that has everything loaded) and all client connections will have the packages already loaded and available* immediately. You could replace the Rscript call with a very tiny Rserve client program which just calls source(""). I can give you more details if you're interested.

Cheers,
Simon

* - There are some packages that are inherently incompatible with fork(), e.g. you cannot fork a JVM or open connections.
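
For illustration, the setup described above might look roughly like the following sketch. It assumes the Rserve and RSclient packages are installed, that the server runs on the default host and port, and that the package names passed to start_server() stand in for whatever the scripts actually need. The helper names start_server(), source_call() and run_script() are hypothetical and not part of either package.

```r
# Build the call that the server should evaluate, e.g. source("job.R").
# Hypothetical helper; bquote() splices the path into the call.
source_call <- function(path) bquote(source(.(path)))

# Server side: run once in a long-lived session.
# Assumption: Rserve is installed; 'pkgs' is a placeholder package list.
start_server <- function(pkgs = c("stats", "utils")) {
  for (p in pkgs) loadNamespace(p)  # pay the slow load once, up front
  Rserve::run.Rserve()              # serve from this session (blocks)
}

# Client side: a small replacement for the Rscript call.
# Assumption: RSclient is installed and the server uses default settings.
run_script <- function(path) {
  con <- RSclient::RS.connect()
  on.exit(RSclient::RS.close(con))
  # lazy = FALSE: evaluate source_call() locally, send the resulting call
  RSclient::RS.eval(con, source_call(path), lazy = FALSE)
}
```

With a server started via start_server(), a call such as run_script("/path/to/job.R") would source that file in a connection that already has the namespaces loaded. The path must be visible to the server process, and the fork() caveat in the footnote still applies to whatever is pre-loaded.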

> On Jul 20, 2020, at 6:47 AM, Mario Annau <mario.annau using gmail.com> wrote:
> 
> Thanks for the quick responses. As you both suggested, storing the packages
> on a local drive is feasible but comes with a size restriction I wanted to
> avoid. I'll keep this in mind as plan B.
> @Hugh: 2. would impose even greater slowdowns and 4. is just not feasible.
> However, 3. sounds interesting - how would this work in a Linux environment?
> 
> Thank you,
> Mario
> 
> 
> On Sun, 19 Jul 2020 at 20:11, Hugh Parsonage <
> hugh.parsonage using gmail.com> wrote:
> 
>> My advice would be to avoid the network in one of the following ways:
>> 
>> 1. Store installed packages on your local drive
>> 2. Copy the installed packages to a tempdir on your local drive each time
>> the script is executed
>> 3. Keep an R session running in perpetuity and source the scripts within
>> that everlasting session
>> 4. Rewrite your scripts to use base R only.
>> 
>> I suspect this solution list is exhaustive.
>> 
>> On Mon, 20 Jul 2020 at 1:50 am, Mario Annau <mario.annau using gmail.com> wrote:
>> 
>>> Dear all,
>>> 
>>> in our current setting we have our packages stored on a (rather slow)
>>> network drive and need to invoke short R scripts (using Rscript) in a
>>> timely manner. Most of the scripts' runtime is spent on package loading
>>> using library() (or loadNamespace() to be precise).
>>> 
>>> Is there a way to cache the package namespaces as listed in
>>> loadedNamespaces() and load them into memory before the script is
>>> executed?
>>> 
>>> My first simplistic attempt was to serialize the environment output
>>> from loadNamespace() to a file and load it before the script is started.
>>> However, loading the object automatically also loads all the referenced
>>> namespaces (from the slow network share) which is undesirable for this use
>>> case.
>>> 
>>> Cheers,
>>> Mario
>>> 
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>> 