[Rd] more on bug 7924

Thomas Lumley tlumley at u.washington.edu
Mon Jun 5 16:33:31 CEST 2006


On Mon, 5 Jun 2006, Hin-Tak Leung wrote:

> I see you have found the sexptype listing in Rinternals.h. I believe
> it was in one of R's FAQs about R's garbage collector - it doesn't do
> proper reference-counted garbage collection as you suggested, but does
> a sort of poor man's garbage collection, by classifying entities in
> *only* 3 categories - not-in-use, in-use-by-one, and
> in-use-by-more-than-one.

AFAIK the NAMED field is not used at all by the garbage collector and that 
certainly isn't what it's there for.  The garbage collector is a 
generational mark-and-sweep collector, not reference counted at all.

NAMED is about preserving the "call-by-value illusion": an object with
NAMED=0 or 1 can be modified without copying it, while one with NAMED=2
must be duplicated first. Modifying a shared object in place seems to be
exactly the problem in PR#7924.
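
As a rough illustration of that rule, here is a toy model in C. It is only a
sketch under stated assumptions, not R's actual implementation: toy_vec,
toy_alloc, toy_duplicate, toy_bind, toy_set and toy_free are made-up names,
and the real logic in R's evaluator is considerably more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an R vector. `named` imitates the NAMED field:
   0 = no binding, 1 = one binding, 2 = possibly shared. */
typedef struct {
    int named;
    size_t len;
    double *data;
} toy_vec;

/* Allocate a zero-filled vector with named = 0. */
toy_vec *toy_alloc(size_t len) {
    toy_vec *v = malloc(sizeof *v);
    assert(v != NULL);
    v->named = 0;
    v->len = len;
    v->data = calloc(len ? len : 1, sizeof *v->data);
    assert(v->data != NULL);
    return v;
}

/* Deep copy; the copy starts with named = 0. */
toy_vec *toy_duplicate(const toy_vec *src) {
    toy_vec *copy = toy_alloc(src->len);
    memcpy(copy->data, src->data, src->len * sizeof *src->data);
    return copy;
}

/* Model `x <- value`: each new binding raises named, saturating at 2. */
toy_vec *toy_bind(toy_vec *v) {
    if (v->named < 2)
        v->named++;
    return v;
}

/* Model `x[i] <- value`: write in place when named is 0 or 1, duplicate
   first when named is 2. Returns the object x should refer to afterwards. */
toy_vec *toy_set(toy_vec *v, size_t i, double value) {
    assert(i < v->len);
    if (v->named == 2) {
        v = toy_duplicate(v);
        v->named = 1; /* the copy is bound only to the variable assigned to */
    }
    v->data[i] = value;
    return v;
}

void toy_free(toy_vec *v) {
    free(v->data);
    free(v);
}
```

In this model a vector that has been through toy_bind once is modified in
place, and one that has been through it twice is copied before the write. A
second pointer to an object whose named was never raised sees the in-place
write too. That is the kind of aliasing the dump quoted below suggests.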

 	-thomas

> Kevin B. Hendricks wrote:
>> Hi,
>>
>> Okay I threw together a quick dump_object routine and found something
>> that I don't think is correct when call2 is created.
>>
>> > call2 <- Quote(f(arg[[1]]))[c(1,2,2,2)]
>> > get("call2")
>>
>> I use the do_get break to find the SEXP value I want
>>
>> Breakpoint 1, do_get (call=0xc2d530, op=0x52bd30, args=0x9e83a8,
>> rho=Variable "rho" is not available.
>> ) at ../../../r-devel/r-devel/R/src/main/envir.c:1668
>> 1668        if (PRIMVAL(op)) { /* have get(.) */
>>
>>
>> (gdb) print *rval
>> $2 = {sxpinfo = {type = 6, obj = 0, named = 1, gp = 0, mark = 0,
>> debug = 0, trace = 0, fin = 0, gcgen = 0, gccls = 0}, attrib =
>> 0x508818, gengc_next_node = 0x9e7d50,
>>    gengc_prev_node = 0x9e7ce0, u = {primsxp = {offset = 10663048},
>> symsxp = {pname = 0xa2b488, value = 0x9e7ce0, internal = 0x508818},
>> listsxp = {carval = 0xa2b488,
>>        cdrval = 0x9e7ce0, tagval = 0x508818}, envsxp = {frame =
>> 0xa2b488, enclos = 0x9e7ce0, hashtab = 0x508818}, closxp = {formals =
>> 0xa2b488, body = 0x9e7ce0,
>>        env = 0x508818}, promsxp = {value = 0xa2b488, expr = 0x9e7ce0,
>> env = 0x508818}}}
>>
>>
>> Now I invoke my own dump routine, which keeps track of recursion level
>> and dumps the named field and other things inside the newly created
>> object. The format of the output is
>>
>> recursion level: SEXP X TYPEOF(X) and then some object specific values
>>
>>
>> (gdb) call dump_object(rval, 0)
>>
>>
>> 0: 0x9e7d18 LANGSXP Object with length 1, named 1
>>      f(arg[[1]], arg[[1]], arg[[1]])
>> 1: 0xa2b488 SYMSXP  name at 0xa29408, value at 0x5087e0, named 0
>>      f
>> 1: 0x9e9880 LANGSXP Object with length 1, named 0
>>      arg[[1]]
>> 2: 0x508738 SYMSXP  name at 0x51c788, value at 0x527690, named 0
>>      `[[`
>> 2: 0xc37cc8 SYMSXP  name at 0xc376e8, value at 0x5087e0, named 0
>>      arg
>> 2: 0xf94cb8 REALSXP Object, length 1, starting at 0xf94ce0, named 0
>>      1
>> 1: 0x9e9880 LANGSXP Object with length 1, named 0
>>      arg[[1]]
>> 2: 0x508738 SYMSXP  name at 0x51c788, value at 0x527690, named 0
>>      `[[`
>> 2: 0xc37cc8 SYMSXP  name at 0xc376e8, value at 0x5087e0, named 0
>>      arg
>> 2: 0xf94cb8 REALSXP Object, length 1, starting at 0xf94ce0, named 0
>>      1
>> 1: 0x9e9880 LANGSXP Object with length 1, named 0
>>      arg[[1]]
>> 2: 0x508738 SYMSXP  name at 0x51c788, value at 0x527690, named 0
>>      `[[`
>> 2: 0xc37cc8 SYMSXP  name at 0xc376e8, value at 0x5087e0, named 0
>>      arg
>> 2: 0xf94cb8 REALSXP Object, length 1, starting at 0xf94ce0, named 0
>>      1
>>
>>
>>
>> Notice how each LANGSXP subobject reuses the exact same objects
>> (the addresses are the same) 3 times (once for each entry), but the
>> named value is always 0 for all of them, even though that address is
>> being re-used (effectively "named") each time.
>>
>> 1: 0x9e9880 LANGSXP Object with length 1, named 0
>>      arg[[1]]
>> 2: 0x508738 SYMSXP  name at 0x51c788, value at 0x527690, named 0
>>      `[[`
>> 2: 0xc37cc8 SYMSXP  name at 0xc376e8, value at 0x5087e0, named 0
>>      arg
>> 2: 0xf94cb8 REALSXP Object, length 1, starting at 0xf94ce0, named 0
>>      1
>>
>>
>> Shouldn't all 3 copies have named set to 1 and not zero since they
>> are all pointing to the same pieces of memory?  And shouldn't that
>> force the top level LANGSXP object to have named of 2 in this case
>> and not its current value of 1?
>>
>>
>> How should any assignment to any of those 3 places in the LANGSXP
>> list ever know that it must duplicate first when all of the named
>> values are 0 even though they all point to the same block of memory?
>>
>> I truly do not understand how named is being used in this case.  Why
>> don't we simply refcount all allocated objects so we know what the
>> true value of named must be?  How else can we get that information?
>>
>> Hints welcome, especially pointers to reading material that explains
>> more about this.
>>
>> Thanks,
>>
>> Kevin
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


