[Rd] Creating a package which contains stand-alone C code

Sat Apr 28 15:35:51 CEST 2012

On Apr 27, 2012, at 9:54 AM, Rajen Shah wrote:

> I would like to create an R package which uses some C code, which in
> turn uses MPI. At the moment I'm only interested in creating this
> package for UNIX-like systems. The way I envisage this working is for
> the R package to contain an R function which uses the system call to
> run the C code as a separate process (passing to the C code the
> location of a file of data). The C code can then do what it needs to
> do with the data, send its output to a file, and when it has finished
> R can read the output from the file.
> 
> My question is, is it possible to create an R package which contains C
> code which can then be called by an R function in that package using
> the system call? The obstacles seem to be (i) compiling this C code in
> the right way for it to be called by system, and (ii) giving the R
> function responsible for calling the C code (via system) the location
> of the executable.
> 
> From my understanding of the R extensions manual, (i) can be solved
> using careful configure script and Makevars file (though I don't know
> the details),

Simply add a target for your executable to Makevars.

> and (ii) would require me to provide an R script
> 'src/install.libs.R', which would need to copy the executable to the
> right place, and modify my R function which uses the system call so it
> knows where to look for the executable.
> 

Yes, you can have a look at Rserve (preferably the development version) for both of the above. It is not the perfect example (because it grew organically and is a bit more complex) but it does exactly that.

> The reason I’m interested in calling the C code in this peculiar way,
> rather than using the .C interface, for example, is that I’m worried
> about using MPI from within R. At least, my knowledge of both R and
> MPI is insufficient to be confident that calling MPI from within R
> will run smoothly. Also, this way I can debug the C program entirely
> on its own.
> 

Conceptually, you can achieve the same thing without another executable by forking and calling the main() function of your program -- that way you don't need another executable yet you can compile your code either as a stand-alone program (for testing) or as a package (for deployment):

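#include <stdlib.h>
#include <unistd.h>
#include <Rinternals.h>

int main(int argc, char **argv); /* the program's own entry point */

SEXP call_main(SEXP args) {
   int argc = LENGTH(args), i;
   pid_t pid;
   char **argv = (char**) calloc(argc + 1, sizeof(char*)); /* zeroed, so NULL-terminated */
   if (!argv) Rf_error("cannot allocate argv");
   for (i = 0; i < argc; i++) argv[i] = (char*) CHAR(STRING_ELT(args, i));
   if ((pid = fork()) == 0) exit(main(argc, argv)); /* child: run the program and quit */
   free(argv); /* parent: the child works on its own copy */
   return ScalarInteger(pid); /* child's pid, or -1 if fork() failed */
}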

and set PKG_CPPFLAGS=-Dmain=prog_main in src/Makevars so that main is re-mapped for the package build, in case it conflicts with R.
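
Since call_main returns right after the fork, R still needs some way to learn that the child has finished before it reads the output file. One possible way is to wait on the child's pid. The following is only a sketch under assumptions: prog_main here is a hypothetical stand-in for the re-mapped main(), run_and_wait is an invented helper name, and the R glue is left out so that the fragment compiles on its own on a POSIX system.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the program's main() after it has been
   re-mapped with -Dmain=prog_main. It succeeds only if it was given
   at least one argument besides the program name. */
int prog_main(int argc, char **argv) {
    (void) argv;
    return argc > 1 ? 0 : 2;
}

/* Fork, run prog_main() in the child, and block in the parent until the
   child is done. Returns the child's exit status, or -1 if fork() or
   waitpid() failed or the child was terminated by a signal. */
int run_and_wait(int argc, char **argv) {
    int status;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: _exit() skips atexit handlers inherited from the parent.
           It also skips flushing stdio buffers, so a real program should
           close or flush its output files before returning. */
        _exit(prog_main(argc, argv));
    }
    /* Parent: retry if the wait is interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A blocking wait like this gives up running the child in the background; if R should stay responsive, the pid returned by call_main could instead be polled with waitpid and the WNOHANG flag.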

In general, you can do better and pass your data directly -- just define some interface in your program -- it will be much more efficient than going through files. Then your main() for the stand-alone program will read in the files and call that interface whereas R will call it directly.
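
As a rough illustration of that layout, here is a sketch in which one function does the real work, a thin wrapper exposes it to R, and main() only handles the file. All names are hypothetical (column_mean, r_column_mean, and the STANDALONE macro are made up for this example), and computing a mean merely stands in for the real computation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Core interface (hypothetical): the real work, with no file I/O. */
double column_mean(const double *x, int n) {
    double sum = 0.0;
    int i;
    if (n <= 0) return 0.0;
    for (i = 0; i < n; i++) sum += x[i];
    return sum / n;
}

/* Entry point usable from R's .C interface, where every argument
   arrives as a pointer and results are written back in place. */
void r_column_mean(double *x, int *n, double *result) {
    *result = column_mean(x, *n);
}

#ifdef STANDALONE
/* Stand-alone driver: read whitespace-separated numbers from the file
   named in argv[1], then call the same interface. */
int main(int argc, char **argv) {
    double *x = NULL, v;
    int n = 0, cap = 0;
    FILE *f;
    if (argc < 2 || !(f = fopen(argv[1], "r"))) {
        fprintf(stderr, "usage: prog datafile\n");
        return 1;
    }
    while (fscanf(f, "%lf", &v) == 1) {
        if (n == cap) {
            double *tmp;
            cap = cap ? 2 * cap : 64;
            tmp = realloc(x, cap * sizeof(double));
            if (!tmp) { free(x); fclose(f); return 1; }
            x = tmp;
        }
        x[n++] = v;
    }
    fclose(f);
    printf("%g\n", column_mean(x, n));
    free(x);
    return 0;
}
#endif
```

Compiled with -DSTANDALONE this gives the test program; built as part of the package without that flag, R could call the wrapper along the lines of .C("r_column_mean", as.double(x), as.integer(length(x)), result = double(1))$result, with no intermediate file.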

Cheers,
Simon

> I realise there is an R package, Rmpi, which provides a wrapper for
> most of the MPI functions, but since all my code will be in C, it
> seems less sensible to make use of this, though I may be wrong.
> 
> Thank you for taking the time to read this, and I very much
> appreciate any advice, (especially) even if it is to say that my
> proposed approach is entirely daft and I should do things completely
> differently. Also, if you know of any examples of packages which do
> what I’ve described above, I’d be very glad to know (it seems Sjava
> does something like this?).
> 
> Best wishes,
> 
> Rajen Shah
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel