[Rd] malloc/calloc/strdup and R's equivalents

Sun Mar 18 15:38:48 CET 2012

On Sat, Mar 17, 2012 at 11:47:24PM -0400, Simon Urbanek wrote:
[...]
> You can always define Strdup() since strdup() is just a shorthand for
> malloc()+strcpy() -- in fact in R it's easier since Calloc will never return
> NULL so trivially
> #define Strdup(X) strcpy(Calloc(strlen(X)+1, char), X)
[...]

Yes, I had something like that in mind (and have already implemented it,
but not used it so far).
I thought something similar might already be available as an
R-internal adaptation.
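
For illustration, a minimal sketch of such a helper, written as a function
rather than a macro. The Calloc()/Free() definitions inside the #ifndef are
stand-ins I assume here only so that the sketch builds without R's headers;
in a package one would include <R.h> instead. The name Strdup() is the one
from the macro quoted above, not an existing R function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stand-ins so the sketch builds without R's headers.
   In a package, include <R.h>, which supplies Calloc() and Free(). */
#ifndef Calloc
static void *chk_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (p == NULL) {            /* R would raise an R error here instead */
        fprintf(stderr, "allocation failed\n");
        abort();
    }
    return p;
}
#define Calloc(n, t) ((t *) chk_calloc((size_t) (n), sizeof(t)))
#define Free(p) (free((void *) (p)), (p) = NULL)
#endif

/* Hypothetical helper: duplicate a C string into Calloc'ed memory.
   The caller owns the copy and must release it with Free(). */
static char *Strdup(const char *s)
{
    assert(s != NULL);
    char *copy = Calloc(strlen(s) + 1, char);
    return strcpy(copy, s);
}
```

The copy is not managed by R in any way, so every Strdup() needs a matching
Free(), also on the error paths.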

> 
> But when using Calloc be aware that you have to clean up -- even if there is
> an error (which is why it is generally not a good idea to use Calloc for
> anything but memory you want to keep beyond your call).

Yes.

This does not seem to be the case for R_alloc()?

> There are two reasons for removing malloc/calloc: a) R uses custom allocators
> on systems with poor OS allocators (like Windows) so R's allocation is more
> efficient [and combining different allocators even worsens the performance] and
> b) you should think twice about the life span of your objects under error
> conditions. It is quite challenging to keep that straight, so in most cases it
> is better to use managed memory instead (but there is a performance trade-off).

Yes, good reasons.

> There are still valid reasons for not using R's allocators, but those are
> special cases that the author must be conscious about (after weighing the
> options).

Calloc() is provided by R, so it appears to be an R allocator,
but it rather seems to be a wrapper around calloc().
By R allocators, do you mean functions like allocVector()?
And R_alloc()?
The behaviour of the latter is not clear to me.
(A reason to prefer Calloc() here: then the behaviour
 is under my control, even if it's more "risky". That's the C-ish way.)

> 
> 
> > mkChar seems not to be the right thing, because, what I found in the
> > above mentioned documentation, says that a C-string is fed in, but
> > CHARSXP is what I get as result.
> > 
> > What I try to do, is, to pick some code and port it to R.
> > If I need to rewrite a lot of stuff, just because I need
> > to adapt to R, this would be more effort than if it would be possible
> > just to have some functions, that could make porting easier.
> > 
> > For example, if I want to reuse a function, that opens a file
> > and needs a   char* stringname
> > then it would add more effort if I need to go via CHARSXP or so.
> > 
> 
> I'm not sure where you are going with this, since all strings will already
> come as CHARSXPs from R so typically you don't need to allocate anything. The
> only reason to allocate is to create persistent objects (for which you'd use
> Calloc) - your example doesn't fit here...

If I want to copy the filename into a C-struct and then pass (a pointer to)
that struct around, then using something like strdup() is a good idea.
Otherwise the string behind the filename pointer might have vanished,
and the pointer points into nirvana.
That is where the question came from. I used some already existing code,
where this all made sense.
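
To illustrate the situation, a minimal sketch in plain C. The struct and the
function names are made up for this mail and are not taken from the real
code; the point is only that the struct keeps its own copy of the filename,
so it stays valid after the caller's string is gone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct, standing in for the one in the existing code. */
typedef struct {
    char *filename;   /* owned copy, released in reader_close() */
} reader_t;

/* Create a reader that owns a private copy of the filename.
   Returns NULL if memory cannot be obtained. */
static reader_t *reader_open(const char *filename)
{
    assert(filename != NULL);
    reader_t *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    size_t len = strlen(filename) + 1;   /* include the terminating NUL */
    r->filename = malloc(len);
    if (r->filename == NULL) {
        free(r);
        return NULL;
    }
    memcpy(r->filename, filename, len);
    return r;
}

/* Release the struct together with its copy of the filename. */
static void reader_close(reader_t *r)
{
    if (r == NULL)
        return;
    free(r->filename);
    free(r);
}
```

With this, the pointer inside the struct no longer depends on the lifetime
of the string it was created from, but reader_close() has to be called on
every path, including the error paths.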

And extracting the char* from the CHARSXP once, instead of
using something like CHAR( STRING_ELT(filename_sexp, 0) )
at every place where I may want to access that filename,
makes the code clearer.

As I reused existing code, I tried to change it
as little as possible (to make it easy).

But as I now see, for the special case here I can forget the
strdup() and maybe make my struct smaller (throwing out the filename),
because here I call the reader-function with filename as argument,
and do not pass around the structure-ptr too much.
And I store the filename in the list I give back to the user.
(I can do this even in the R-code which calls my C-code.)

So in this particular case the problem vanishes.
Nevertheless, in other cases it might be a problem.

But then, maybe never touching the code and just living with the
already existing strdup()/malloc()/calloc()/free()
would also work.

I can see the advantages of using R-provided functions.
But if they do not substitute for all these deprecated functions,
then either porting the stand-alone code to R takes
much effort, or the deprecated functions might just stay in.

Do you see what I mean?

> 
> > If nothing of that stuff works, I would need to use the original
> > calloc() / free() functions, which are deprecated in the above
> > mentioned manual.
> > 
> > 
> > Ciao,
> >   Oliver
> > 
> > P.S.: Also I wonder... R_alloc() does not need a free()-like call?
> >      Is it freed automatically?
> >      And at which time?
> >      In the docs is mentioned, at the end of the .C/.Call/.External call.
> >      Does this mean, this is freed/reclaimed automatically, when returning
> >      from the C-world to the R-world?
> >      At (after) return of the extension's code?
> > 
> 
> Yes. It simply uses an R object so it is subject to regular R object rules
> and thus gets cleaned up on error as well (unlike anything you use Calloc for).

This point is not so clear to me.

The prototype of R_alloc() looks as if R_alloc() were just the same as calloc().
But now you say it's an R object that will be handled like any R object.

But then... why is PROTECT not mentioned there in section 6.1.1?
Could a GC run possibly reclaim that memory?

What rules are there for it?

Is it safe to use it from the entry point of my code to the return point of my code?

So, does it behave like an environment in functional languages?

Can I allocate memory with R_alloc() somewhere deep in my code
and pass that memory around until my .Call entry point returns,
without the GC claiming it before the return?

Can you please clarify these points?

Ciao,
   Oliver