[R] Architecting an optimization with external calls

Tue Nov 4 21:57:14 CET 2003

I have a likelihood I would like to compute using C++ and then
optimize.  It has data that need to persist across individual calls to
the likelihood.  I'd appreciate any recommendations about the best way
to do this.  There are several related issues.

1. Use the R optimizer or a C optimizer?
Because of the persistence problems (see below), using a C optimizer has
a certain attraction.  However, the C methods described in 5.8 of the
"Writing R Extensions" include the caveat that "No function is provided
for finite-differencing, nor for approximating the Hessian at the
result."  That's a big drawback, since I need that information. 
(Probably I will be doing this without analytic derivatives.)
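
To make the missing piece concrete: if the C optimizer route were taken,
the finite differencing would have to be supplied by hand.  A minimal
sketch of what that might look like, assuming plain central differences
with fixed step sizes.  The names Objective, fd_gradient and fd_hessian
and the default steps are hypothetical, not part of the R API, and a
real implementation would likely scale the step to each parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical objective type: maps a parameter vector to a scalar.
using Objective = std::function<double(const std::vector<double>&)>;

// Central-difference gradient; x is taken by value so it can be perturbed.
std::vector<double> fd_gradient(const Objective& f,
                                std::vector<double> x,
                                double h = 1e-5) {
    assert(h > 0.0);
    std::vector<double> g(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double xi = x[i];
        x[i] = xi + h; const double fp = f(x);
        x[i] = xi - h; const double fm = f(x);
        x[i] = xi;  // restore before moving to the next coordinate
        g[i] = (fp - fm) / (2.0 * h);
    }
    return g;
}

// Central-difference Hessian, filled symmetrically.
std::vector<std::vector<double>> fd_hessian(const Objective& f,
                                            std::vector<double> x,
                                            double h = 1e-4) {
    assert(h > 0.0);
    const std::size_t n = x.size();
    std::vector<std::vector<double>> H(n, std::vector<double>(n, 0.0));
    const double f0 = f(x);
    for (std::size_t i = 0; i < n; ++i) {
        const double xi = x[i];
        x[i] = xi + h; const double fp = f(x);
        x[i] = xi - h; const double fm = f(x);
        x[i] = xi;
        H[i][i] = (fp - 2.0 * f0 + fm) / (h * h);
        for (std::size_t j = 0; j < i; ++j) {
            const double xj = x[j];
            x[i] = xi + h; x[j] = xj + h; const double fpp = f(x);
            x[j] = xj - h;                const double fpm = f(x);
            x[i] = xi - h;                const double fmm = f(x);
            x[j] = xj + h;                const double fmp = f(x);
            x[i] = xi; x[j] = xj;  // restore both coordinates
            H[i][j] = H[j][i] = (fpp - fpm - fmp + fmm) / (4.0 * h * h);
        }
    }
    return H;
}
```

The Hessian costs on the order of 2*n^2 likelihood evaluations, so this
is only reasonable when each evaluation is cheap or n is small.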

2. How to persist the data?
I think my preferred approach would be to pass data back to R (assuming
the "optimize with R" approach above), and then pass it on to subsequent
calls.  The data would be the top of an object graph (i.e., there are
pointers to disconnected chunks of memory) and it is not clear to me how
to do this.  First, the documentation doesn't indicate any "opaque" data
type; should I use character (STRSXP)?  Second, I'm not sure how to
protect it and the other chunks of memory.  Does each one need to go
inside a PROTECT call?  And is it safe to have one invocation from R do
PROTECT, and another much later one do UNPROTECT?  (All the examples I
saw had both calls within the same function invocation.)

My hope is that if I allocate an object outside of R and don't tell R
about it, R will never touch it.  So I only need PROTECT for something
going back to R.  True?
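
To illustrate the kind of layout I have in mind, here is a minimal
sketch on the C++ side only, assuming a single root object that owns
every other chunk, so that only one opaque handle would ever need to
cross into R.  LikState, Moments, lik_create, lik_eval and lik_destroy
are hypothetical names, the toy likelihood is a placeholder, and how the
handle is actually wrapped for R is exactly the open question, so it is
not shown.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical cached quantities, allocated separately from the root.
struct Moments {
    double sum = 0.0;
    double sum_sq = 0.0;
};

// Hypothetical root of the object graph; it owns everything beneath it.
struct LikState {
    std::vector<double> data;
    std::unique_ptr<Moments> moments;
};

// Build the state once, on the C++ heap, and return an opaque handle.
extern "C" void* lik_create(const double* x, int n) {
    assert(x != nullptr && n >= 0);
    std::unique_ptr<LikState> state(new LikState);
    state->data.assign(x, x + n);
    state->moments.reset(new Moments);
    for (double v : state->data) {
        state->moments->sum += v;
        state->moments->sum_sq += v * v;
    }
    return state.release();  // caller now owns the handle
}

// Toy placeholder likelihood: -0.5 * sum((x_i - mu)^2), computed from
// the cached moments rather than by revisiting the data.
extern "C" double lik_eval(void* handle, double mu) {
    const LikState* s = static_cast<const LikState*>(handle);
    const double n = static_cast<double>(s->data.size());
    return -0.5 * (s->moments->sum_sq - 2.0 * mu * s->moments->sum
                   + n * mu * mu);
}

// Deleting the root releases the whole graph through its owning members.
extern "C" void lik_destroy(void* handle) {
    delete static_cast<LikState*>(handle);
}
```

With this layout nothing below the root is ever visible to R, so the
protection question reduces to the single handle.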

Also, the docs say not to protect too many items, and I may have a lot
of them.  So I'd probably end up having to write my own allocator that
draws from protected pools, and that's just another layer of junk
relative to the original problem.

Another approach would be to just hang the data somewhere in the global
space of the shared library.  On general principles this is a poor
approach ("don't use globals"), manifest in specific failings such as
lack of thread safety.  I also suspect the issues with getting that to
work portably are probably considerable (as in, it may not be possible).

P.S. The example of Zero-finding (4.9.1 in "Writing R Extensions") is,
unfortunately, the reverse of this case.  In the example, the function
to be optimized is in R, while the optimizer is in C.
-- 
Ross Boylan                                      wk:  (415) 502-4031
530 Parnassus Avenue (Library) rm 115-4          ross at biostat.ucsf.edu
Dept of Epidemiology and Biostatistics           fax: (415) 476-9856
University of California, San Francisco
San Francisco, CA 94143-0840                     hm:  (415) 550-1062