[Rd] more on bug 7924

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Jun 5 14:02:19 CEST 2006


Hin-Tak Leung <hin-tak.leung at cimr.cam.ac.uk> writes:

> I see you have found the sexptype listing in Rinternals.h . I believe
> it was in one of R's FAQ's about R's garbage collector - it doesn't do
> proper reference-counted garbage collection as you suggested, but does
> a sort of poor man's garbage collection, by classifying entities in
> *only* 3 categories - not-in-use, in-use-by-one, and
> in-use-by-more-than-one.

Not quite: more like freshly-made-not-assigned,
assigned-but-only-once, assigned-maybe-more-than-once. 

It's also not so much about GC as about modifiability: In the first
case, modify at will. In the 2nd case you can modify in an assignment
function. In the 3rd case, you must duplicate the object first.

Consider

f <- function(x){x[3]<-10; x}

f(rnorm(10))

b <- rnorm(10)
f(b)

In the first case, rnorm() returns an unnamed object. (Well, it could.
I'm not too sure it actually does.) When the object is passed to f(),
it gets named "x", but it is the only copy and the modification to
x[3] can proceed safely.

In the second case you first assign to b, then pass b to f(), inside
of which it is named "x". This proceeds without duplication, so the
same object is now assigned twice. Modifying x in place at this point
would cause b to change as well, which would violate pass-by-value
semantics. Hence, we need to create a duplicate of x which we can
safely change.
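
A minimal sketch of how one might check the second case from the R
prompt. It only tests the visible semantics (b is left alone), not
whether or when a duplicate is made internally. The names old_b3 and
y, and the call to set.seed(), are my additions for illustration.

```r
# Same f() as above: modifies its argument and returns it.
f <- function(x) { x[3] <- 10; x }

set.seed(1)        # assumption: seed added only for reproducibility
b <- rnorm(10)
old_b3 <- b[3]     # remember the original third element (hypothetical name)

y <- f(b)          # x inside f() and b outside refer to the same object,
                   # so the assignment to x[3] has to work on a duplicate

stopifnot(b[3] == old_b3)           # b itself is unchanged
stopifnot(y[3] == 10)               # the returned vector has the new value
stopifnot(identical(y[-3], b[-3]))  # all other elements still agree
```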

Unlike Tcl, R doesn't use its refcounts for garbage collection. Partly
this is because it is not a true count that you can decrement and use
to throw away the object when the count goes to zero. However,
refcounting is also problematic to implement in R because we can have
reference loops: Consider

g <- function(){...whatever...; e <- environment(); ...}

Now when g() is called it creates an environment to hold its local
variables, and when g finishes, the environment can be destroyed,
provided that there are no references to it from other objects. In the
above case, we do have a reference to the environment, but it comes
from an object that is inside the environment and would be destroyed
along with it. A strict refcounting system would leave such
environments hanging around forever.
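
The loop can be made visible with a small runnable variant of g().
This is only a sketch: the body (x <- 1) stands in for the elided
code, and g() returns e purely so that the frame can be inspected
afterwards, which of course also keeps it alive. The name env is my
addition.

```r
g <- function() {
  x <- 1               # stand-in for the elided body (assumption)
  e <- environment()   # e is stored inside the very frame it refers to
  e                    # returned only so we can look at the frame
}

env <- g()

stopifnot(is.environment(env))
stopifnot(identical(env$e, env))      # the frame holds a reference to itself
stopifnot(identical(env$e$e$e, env))  # following the loop never leaves it
```

With a pure refcount, the binding e alone would keep the count on the
frame above zero even after every outside reference is gone.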

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


