[Rd] as.environment.list provides inconsistent results under torture

luke-tierney at uiowa.edu luke-tierney at uiowa.edu
Wed Jan 12 14:18:19 CET 2011


On Tue, 11 Jan 2011, Simon Urbanek wrote:

>
> On Jan 11, 2011, at 6:55 PM, <luke-tierney at uiowa.edu> <luke-tierney at uiowa.edu> wrote:
>
>> No. Lots of internal functions expect their callers to protect their arguments, for efficiency reasons. eval is called very often and almost always with argument that are protected because they are in the evaluation engine, so it would be wasteful and potentially very costly if eval protected its arguments every time it is called. (I don't tknow what the cost would be to do so in the current implementation but it could be prohibitive if we moved to some different approaches, so for now we whould continue to expect callers of eval to make sure the argumetns are protected.)
>>
>
>
> Fair enough. It would be nice if this was explicitly documented since eval() is part of the API and I see several packages on CRAN using eval(LCONS(..),..) and eval(listX(...),...) - and I don't blame them (partly because one of them is mine ;)).

There are lots of functions in the internals, API or not, that do not
protect their arguments. I would argue that the right way to think
about this is: unless a function is explicitly documented to protect
its arguments you have to assume it either does not now or might not
in the future.

In any case, with functions of more than one SEXP argument, like eval,
protecting the arguments might give inexperienced programmers a false
sense of security: if f, g, and h all allocate and f protects its
arguments they might then be tempted to write

      f(g(), h())

But that would be wrong since one of the values of g() or h() would be
unprotected while the other one is computed and before f() has a
chance to do its protecting (which one is unprotected depends on the
argument evaluation order the C compiler chooses for the expression).

One of the few lower level functions that protects its arguments is
CONS (and some of its cousins like LCONS) both because it is cheap to
do so because of the integration with the GC and because it makes
certain kinds of code a bit more readable. But again if you want to
take advantage of this only one of the arguments can allocate. For
example, in the first case below there is

     LCONS(install("library"),CONS(install("grDevices"),R_NilValue))

Both LCONS and CONS protect their arguments, but one of the two
arguments to LCONS is unprotected while the other is being
computed. (Of course the install calls only allocate if the symbols
are not in the symbol table and even then those allocations are
protected by being put in the symbol table, and as library will be
there once base is loaded this is safe for all practical purposes, but
it would still be better to write it as

     lang2(install("library"), install("grDevices"))

or

      SEXP librarySymbol = install("library");
      SEXP grDevicesSymbol = install("grDevices");
      ...
      lang2(librarySymbol, grDevicesSymbol)

or something like that.)

I suppose we could have more warnings about this sort of thing in the
extensions manual.

I did do a quick scan of R-devel for this issue with eval and found these:

     ./unix/aqua.c:	     eval(LCONS(install("library"),CONS(install("grDevices"),R_NilValue)),R_GlobalEnv);
     ./unix/sys-std.c:    infile = PROTECT(eval(lang1(RComp_getFileCompSym), rcompgen_rho));
     ./modules/X11/dataentry.c:       newval <- eval(parse(text=newval))
     ./main/envir.c:			 return(eval(lang4(install("list2env"), arg,
     ./gnuwin32/dataentry.c:       newval <- eval(parse(text=newval))

I'll fix them in the next couple of days if no one else gets there
first (but I'm not set up to test the aqua or gnuwin32 ones).

Best,

luke

> Unfortunately all the examples in R-ext use implicitly protected arguments (as function arguments or parts of larger already protected constructs) so it's not obvious from that, either.
>
> Thanks,
> Simon
>
>
>
>
>>
>>
>> On Tue, 11 Jan 2011, Simon Urbanek wrote:
>>
>>> Interesting, I'd argue that the bug is in eval() not protecting its arguments since the usual convention is for functions to protect its arguments...
>>>
>>> On Jan 11, 2011, at 2:33 PM, Romain Francois wrote:
>>>
>>>> Hello,
>>>>
>>>> Using R-devel (rev 53950), I get inconsistent results with as.environment( VECSXP ) when gctorture is on.
>>>>
>>>> Consider:
>>>>
>>>> a <- list( aa = rnorm, bb = runif )
>>>> gctorture(TRUE)
>>>> as.environment( a )
>>>>
>>>> The last line sometimes produces the correct environment, but sometimes I get errors. Here are the three situations:
>>>>
>>>> # good
>>>>> as.environment( a )
>>>> <environment: 0x100b1c978>
>>>>
>>>> # not good
>>>>> as.environment( a )
>>>> Erreur dans length(x) : 'x' est manquant
>>>>
>>>> # not good either
>>>>> as.environment( a )
>>>> Erreur dans list(NULL, list(aa = function (n, mean = 0, sd = 1)  :
>>>> correspondance partielle de chaînes de caractères incorrecte
>>>>
>>>>
>>>> Is it because the call made by lang4 is not protected while evaluated in this line :
>>>>
>>>>   case VECSXP: {
>>>> 	/* implement as.environment.list() {isObject(.) is false for a list} */
>>>> 	return(eval(lang4(install("list2env"), arg,
>>>> 			  /*envir = */R_NilValue, /* parent = */R_EmptyEnv),
>>>> 		    rho));
>>>>   }
>>>>
>>>>
>>>> (BTW, this was detected in a looooooooong Rcpp-devel thread. See http://comments.gmane.org/gmane.comp.lang.r.rcpp/1336)
>>>>
>>>> Romain
>>>>
>>>> --
>>>> Romain Francois
>>>> Professional R Enthusiast
>>>> +33(0) 6 28 91 30 30
>>>> http://romainfrancois.blog.free.fr
>>>> |- http://bit.ly/fT2rZM : highlight 0.2-5
>>>> |- http://bit.ly/gpCSpH : Evolution of Rcpp code size
>>>> `- http://bit.ly/hovakS : RcppGSL initial release
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>>
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> --
>> Luke Tierney
>> Statistics and Actuarial Science
>> Ralph E. Wareham Professor of Mathematical Sciences
>> University of Iowa                  Phone:             319-335-3386
>> Department of Statistics and        Fax:               319-335-3017
>>   Actuarial Science
>> 241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
>> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>
>

-- 
Luke Tierney
Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


More information about the R-devel mailing list