[Rd] looking for advice on bigmemory framework with C++ and java interoperability

Jay Emerson jayemerson at gmail.com
Sat May 5 14:44:59 CEST 2012


On 4 May 2012 at 22:31, andre zege wrote:
| Simon,  thanks for your comment. I guess there is no problem, i am
| apparently being lazy/busy and wondered if there is ready code that does
| it. You are right, i suppose -- i'll look at the c++ code for bigmatrix and
| will try to hack a solution.

> You may want to look at the documentation for 'external pointers' in the
> "Writing R Extensions" manual, and then consider Rcpp::XPtr, which provides
> an Rcpp-based route to using external pointers.

It's nice having others answering our questions before we can -- many
thanks Simon/Dirk!

A big.matrix of dimension RxC is a column-major binary file of R*C
elements of size 1, 2, 4, or 8 bytes, depending on the type of atomic
element.  Period, end of story, no header to worry about.  So you can
use it as you like from any language.  Whether you can mmap it
conveniently (if needed in shared memory or larger-than-RAM
applications) is another story.  We make use of the BOOST interprocess
library for this.
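
To make the layout concrete, here is a minimal C++ sketch of how one might
read a single element straight from such a file without any bigmemory code.
It assumes a matrix of 8-byte doubles, 0-based indices, and a file written in
the machine's native byte order; the function names are hypothetical and are
not part of the bigmemory API. It uses plain stream I/O rather than mmap.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

// Byte offset of element (row, col) in a headerless column-major file.
// Indices are 0-based; elem_size is 1, 2, 4, or 8 depending on the type.
std::uint64_t column_major_offset(std::uint64_t row, std::uint64_t col,
                                  std::uint64_t nrow, std::uint64_t elem_size) {
    assert(row < nrow);
    return (col * nrow + row) * elem_size;
}

// Read one double from the backing file (hypothetical helper).
// Assumes the file holds nrow * ncol doubles in native byte order.
double read_double_element(const std::string& path,
                           std::uint64_t nrow, std::uint64_t ncol,
                           std::uint64_t row, std::uint64_t col) {
    assert(col < ncol);
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);

    // Jump directly to the element; there is no header to skip.
    in.seekg(static_cast<std::streamoff>(
        column_major_offset(row, col, nrow, sizeof(double))));

    double value = 0.0;
    in.read(reinterpret_cast<char*>(&value), sizeof(value));
    if (!in) throw std::runtime_error("short read from " + path);
    return value;
}
```

The same offset arithmetic carries over to a memory-mapped view; only the
way the bytes are fetched changes.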

For working in R, the existing R API should be sufficient (though it
could always be expanded).

For working in C++, the C++ API is pretty low-level and of course
could benefit from ultimately being Rcpp-ified, for example.  There are
plenty of examples of working in C++ inside
bigmemory/biganalytics/bigtabulate.

For Java... well, I don't code in Java.  You can certainly make use of
the data structure easily enough, but whether you can make use of the
existing C++ API is something I simply can't answer.

I note that one really cool trick is when you have data from another
source (e.g. many satellite images) which is already a simple binary
file.  You can do a trivial hack to create a big.matrix descriptor
file, and attach.big.matrix() to it immediately.  No traditional
read.*() is necessary, and it is super fast.
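
One detail worth keeping in mind with this trick: image data is often stored
row-major, and a row-major height x width buffer is byte-for-byte the same as
a column-major width x height matrix. So the file need not be rewritten; the
dimensions are simply swapped when describing it. The C++ sketch below only
illustrates that index equivalence. The function names are hypothetical,
indices are 0-based, and the descriptor file itself is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element offset of pixel (y, x) in a row-major image with `width` columns.
std::size_t row_major_index(std::size_t y, std::size_t x, std::size_t width) {
    assert(x < width);
    return y * width + x;
}

// Element offset of (row, col) in a column-major matrix with `nrow` rows.
std::size_t col_major_index(std::size_t row, std::size_t col, std::size_t nrow) {
    assert(row < nrow);
    return col * nrow + row;
}

// Fetch pixel (y, x) of a row-major height x width image by treating the
// unchanged buffer as a column-major matrix with `width` rows and
// `height` columns, i.e. as element (row = x, col = y).
template <typename T>
T pixel_via_column_major(const std::vector<T>& buf,
                         std::size_t height, std::size_t width,
                         std::size_t y, std::size_t x) {
    assert(buf.size() == height * width);
    assert(y < height);
    return buf[col_major_index(x, y, width)];
}
```

In other words, pixel (y, x) of the image is element (x, y) of the attached
matrix under these assumptions.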

Jay

-- 
John W. Emerson (Jay)
Associate Professor of Statistics
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay



More information about the R-devel mailing list