[R] [External] Function environments serialize to a lot of data until they don't

Ivan Krylov |kry|ov @end|ng |rom d|@root@org
Mon Mar 11 22:17:29 CET 2024


Dear Luke,

Thank you for the detailed explanation of the power of force()! It does
solve my problem in a much more reliable manner than setting function
environments manually.

On Fri, 8 Mar 2024 15:46:52 -0600 (CST)
luke-tierney using uiowa.edu wrote:

> Having a reference to a large environment is not much of an issue
> within a single process, but can be in a distributed memory parallel
> computing context.  To avoid this you can force evaluation of the
> promises:
> 
>      mkLL1 <- function(m, s) {
>          force(m)
>          force(s)
>          function(x) sum(dnorm(x, m, s, log = TRUE))
>      }
>      ll <- f(1e7)
>      length(serialize(ll, NULL))
>      ## [1] 2146
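
For anyone reading this in the archive, here is a self-contained sketch
of the comparison. The excerpt above calls f() without showing it, so
make_ll() below is a hypothetical stand-in, and mkLL_lazy() is my
assumption of what the unforced variant looks like. Exact byte counts
will vary with the R version, so none are shown.

```r
# Lazy variant: m and s remain unevaluated promises inside the closure.
mkLL_lazy <- function(m, s) {
  function(x) sum(dnorm(x, m, s, log = TRUE))
}

# Forced variant, same shape as the quoted mkLL1.
mkLL_forced <- function(m, s) {
  force(m)
  force(s)
  function(x) sum(dnorm(x, m, s, log = TRUE))
}

# Hypothetical caller (stand-in for f): the promises for mean(x) and
# sd(x) refer back to this frame, which holds the n-element vector x.
make_ll <- function(mk, n) {
  x <- rnorm(n)
  mk(mean(x), sd(x))
}

n <- 1e5
size_lazy   <- length(serialize(make_ll(mkLL_lazy,   n), NULL))
size_forced <- length(serialize(make_ll(mkLL_forced, n), NULL))

# Calling the lazy closure once evaluates its promises, after which it
# no longer drags x along when serialized.
ll <- make_ll(mkLL_lazy, n)
invisible(ll(0))
size_after_call <- length(serialize(ll, NULL))

cat("lazy:", size_lazy, " forced:", size_forced,
    " lazy after first call:", size_after_call, "\n")
```

The lazy closure should serialize to roughly the size of x plus some
overhead, while the forced one and the already-called one stay small.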

I think this also illustrates the danger of letting side effects come
near function arguments. A promise to read a file could survive until it
reaches a cluster node, be evaluated there, and result in a lot of
head-scratching. A promise to write to connection number N, if N happens
to coincide with a connection open on the cluster node, could even do
some damage. This is definitely something to remember when creating
closures.
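
A minimal single-process sketch of the file-reading case (the function
names and the temporary file are made up for illustration): the unforced
promise reads the file only when the closure is first called, so it sees
whatever the file contains at that point, not at creation time.

```r
# Lazy: readLines() has not run yet when the closure is returned.
make_counter_lazy <- function(lines) {
  function() length(lines)
}

# Forced: readLines() runs while the closure is being created.
make_counter_forced <- function(lines) {
  force(lines)
  function() length(lines)
}

path <- tempfile()
writeLines(c("a", "b"), path)            # file has 2 lines

count_lazy   <- make_counter_lazy(readLines(path))
count_forced <- make_counter_forced(readLines(path))

writeLines(c("a", "b", "c", "d"), path)  # file now has 4 lines

n_lazy   <- count_lazy()    # promise evaluated here: 4
n_forced <- count_forced()  # value captured earlier: 2

unlink(path)
```

On a cluster node the late evaluation would happen against that node's
file system and connection table, which is where the surprises come
from.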

> A very simple tool available in the snow package for snow clusters is
> snow.time(), which can produce some summary times and a Gantt chart
> (patterned after ones produced by xpvm and xmpi).

I can see the snow.time() plot being useful. Thank you for letting me
know about it!

-- 
Best regards,
Ivan


