[R] fast mkChar

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Jun 9 00:34:36 CEST 2004


"Vadim Ogranovich" <vograno at evafunds.com> writes:

> I am no expert in memory management in R so it's hard for me to tell
> what is and what is not doable. From reading the code of allocVector()
> in memory.c I think that the critical part is to vectorize
> CLASS_GET_FREE_NODE and use the vectorized version along the lines of
> the code fragment below (taken from memory.c).
> 
> 	if (node_class < NUM_SMALL_NODE_CLASSES) {
> 	    CLASS_GET_FREE_NODE(node_class, s); 
> 
> If this is possible then the rest is just a matter of code refactoring.
> 
> By vectorizing I mean writing a macro CLASS_GET_FREE_NODE2(node_class,
> s, n) which in one go allocates n little objects of class node_class and
> "inscribes" them into the elements of vector s, which is assumed to be
> long enough to hold these objects.
> 
> If this is doable then the only missing piece would be a new function
> setChar(CHARSXP rstr, const char * cstr) which copies 'cstr' into 'rstr'
> and (re)allocates the heap memory if necessary. Here the setChar() macro
> is safe since s[i]-s are all brand new and thus are not shared with any
> other object.

I had a similar idea initially, but I don't think it can fly: First,
allocating n objects at once is not likely to be much faster than
allocating them one-by-one, especially when you consider the
implications of having to deal with near-out-of-memory conditions.
Second, you have to know the string lengths when allocating, since the
structure of a vector object (CHARSXP) is a header immediately
followed by the data, so a pre-allocated node could not simply be
filled in later with a string of arbitrary length.
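
To make the second point concrete, here is a rough sketch in plain C of
a "header immediately followed by the data" object. It is not R's
actual node layout: toy_charsxp, toy_mkchar() and toy_setchar() are
hypothetical names made up for this illustration. The point is only
that the allocation size depends on the string length, and that storing
a longer string later forces a reallocation that may move the object.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a CHARSXP-like object: a fixed header
   followed directly by the character data in the same allocation.
   This is NOT R's real node structure. */
typedef struct {
    size_t length;  /* number of characters, excluding the NUL */
    char   data[];  /* C99 flexible array member: bytes follow the header */
} toy_charsxp;

/* The block size depends on the string length, so the length must be
   known before the allocation is made. */
toy_charsxp *toy_mkchar(const char *cstr)
{
    assert(cstr != NULL);
    size_t len = strlen(cstr);
    toy_charsxp *s = malloc(sizeof(toy_charsxp) + len + 1);
    if (s == NULL) return NULL;
    s->length = len;
    memcpy(s->data, cstr, len + 1);  /* copy including the NUL */
    return s;
}

/* Storing a different string means resizing the whole object, so its
   address may change and the caller must use the returned pointer. */
toy_charsxp *toy_setchar(toy_charsxp *s, const char *cstr)
{
    assert(cstr != NULL);
    size_t len = strlen(cstr);
    toy_charsxp *t = realloc(s, sizeof(toy_charsxp) + len + 1);
    if (t == NULL) return NULL;      /* s is still valid on failure */
    t->length = len;
    memcpy(t->data, cstr, len + 1);
    return t;
}
```

Because the object may move when it is resized, anything else holding
the old address would be left dangling, which is one way to see why
filling in pre-allocated nodes after the fact is awkward.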

A more interesting line to pursue is that - depending on what it
really is that you need - you might be able to create a different kind
of object that could "walk and quack" like a character vector, but is
stored differently internally. E.g. you could set up a representation
that is just a block of pointers, each pointing to a string that is
managed malloc-style.

Have a look at External pointers and finalization.
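
A minimal sketch of what such a representation might look like, in
plain C without any R headers. The names strblock, strblock_new(),
strblock_set(), strblock_get() and strblock_free() are hypothetical.
In a real package the block would presumably be wrapped with
R_MakeExternalPtr() and released from a finalizer registered with
R_RegisterCFinalizer(); that wiring is an assumption here and is only
indicated in a comment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: a block of pointers, each pointing to a
   separately malloc'ed, NUL-terminated string (NULL if unset). */
typedef struct {
    size_t n;
    char **items;
} strblock;

strblock *strblock_new(size_t n)
{
    strblock *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    /* Allocate at least one slot so n == 0 does not rely on malloc(0). */
    b->items = malloc((n ? n : 1) * sizeof *b->items);
    if (b->items == NULL) { free(b); return NULL; }
    for (size_t i = 0; i < n; i++) b->items[i] = NULL;
    b->n = n;
    return b;
}

/* Replace element i with a private copy of cstr. Only this one string
   is reallocated; the block itself never moves. Returns 0 on success,
   -1 on a bad index or allocation failure. */
int strblock_set(strblock *b, size_t i, const char *cstr)
{
    assert(b != NULL && cstr != NULL);
    if (i >= b->n) return -1;
    size_t len = strlen(cstr);
    char *copy = malloc(len + 1);
    if (copy == NULL) return -1;
    memcpy(copy, cstr, len + 1);
    free(b->items[i]);               /* free(NULL) is a no-op */
    b->items[i] = copy;
    return 0;
}

const char *strblock_get(const strblock *b, size_t i)
{
    assert(b != NULL);
    return (i < b->n) ? b->items[i] : NULL;
}

/* Release every string and the block itself. In an R package this is
   the work a finalizer on the external pointer would do (assumed
   wiring, not shown here). */
void strblock_free(strblock *b)
{
    if (b == NULL) return;
    for (size_t i = 0; i < b->n; i++) free(b->items[i]);
    free(b->items);
    free(b);
}
```

Since each element is just a pointer, strings of any length can be
stored or replaced one at a time without knowing the lengths up front.
The price is that R's garbage collector knows nothing about the
strings, hence the need for a finalizer.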


-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



