[R] a question of substitute

Adrian Dusa dusa.adrian at gmail.com
Thu Jan 11 09:39:33 CET 2007

Dear Prof. Ripley,

Thank you for this extensive explanation. It looks like my first solution is 
similar to (b): creating new variables inside the wrapper (and new data if 
not missing).
This course is only introductory, with simple models, and I do point students 
to each test function separately if they want anything more complicated.

I'm looking forward to the release of version 2.5.0.
Best regards,

On Thursday 11 January 2007 03:08, Prof Brian Ripley wrote:
> The 'Right Thing' is for oneway.test() to allow a variable for the first
> argument, and I have altered it in R-patched and R-devel to do so. So if
> your students can make use of R-patched that would be the best solution.
> If not, perhaps you could make a copy of oneway.test from R-patched
> available to them.  Normally I would worry about namespace issues, but it
> seems unlikely they would matter here: if they did, assignInNamespace() is
> likely to work to insert the fix.
>
> Grothendieck's suggestions are steps towards a morass: they may work in
> simple cases but can make more complicated ones worse (such as looking for
> 'data' in the wrong place).  These model fitting functions have rather
> precise requirements for where they look for their components:
>  	'data'
>  	the environment of 'formula'
>  	the environment of the caller
> and that includes where they look for 'data'.  It is easy to use
> substitute or such to make a literal formula out of 'formula', but doing
> so changes its environment.  So one needs to either
> (a) fix up an environment within which to evaluate the modified call that
> emulates the scoping rules or
> (b) create a new 'data' that has references to all the variables needed,
> and just call the function with the new 'formula' and new 'data'.
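
To illustrate the point about environments, here is a minimal sketch. The names rebuild() and user_code() are hypothetical and not taken from any package; the sketch only shows that a formula rebuilt inside a wrapper no longer sees the caller's variables until its environment is set back to that of the original formula.

```r
# Hypothetical wrapper that turns its argument into a literal formula.
rebuild <- function(formula) {
  # as.formula() on a character string uses the calling frame, i.e. the
  # inside of rebuild(), as the environment of the new formula.
  as.formula(paste(deparse(formula[[2]]), "~", deparse(formula[[3]])))
}

user_code <- function() {
  y <- c(2.1, 3.4, 1.9, 4.2)   # variables that exist only in this function
  g <- gl(2, 2)
  f <- y ~ g
  r <- rebuild(f)
  fixed <- r
  environment(fixed) <- environment(f)   # restore the original scoping
  list(original = f, rebuilt = r, fixed = fixed)
}

res <- user_code()
sees_y <- function(f) exists("y", envir = environment(f), inherits = FALSE)
sees_y(res$original)  # TRUE
sees_y(res$rebuilt)   # FALSE: 'y' is not visible from inside rebuild()
sees_y(res$fixed)     # TRUE
```

Resetting the environment covers only the formula itself; where 'data' is looked up is a separate matter.
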
> At first sight model.frame() looks the way to do (b), but it is not, since
> if there are function calls in the formula (e.g. log()) the model frame
> includes the derived variables and not the original ones.  There are
> workarounds (e.g. in glmmPQL), like using all.vars, creating a formula
> from that, setting its environment to that of the original function and
> then calling model.frame.
>
> This comes up often enough that I have contemplated adding a solution to
> (b) to the stats package.
> Doing either of these right is really pretty complicated, and not
> something to dash off code for in a fairly quick reply (or even to check that
> the code in glmmPQL was general enough to be applicable).
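
For the simple models in an introductory course, the all.vars route to (b) might be sketched roughly as follows. This is only an illustration under stated assumptions: original_vars_frame() is a hypothetical helper, not the code in glmmPQL, and it makes no attempt to handle the general case (for example, names in the formula that are not data variables).

```r
# Hypothetical helper, not part of any package: collect the *original*
# variables a formula refers to, rather than the derived ones.
original_vars_frame <- function(formula, data = NULL) {
  # all.vars() returns variable names and ignores function calls,
  # so log(y) ~ g gives c("y", "g").
  vars <- all.vars(formula)
  # Build a one-sided formula from the bare names: ~ y + g
  bare <- reformulate(vars)
  # Keep the scoping of the original formula.
  environment(bare) <- environment(formula)
  model.frame(bare, data = data, na.action = na.pass)
}

demo <- function() {
  y <- c(1.2, 3.4, 2.2, 5.1, 4.8, 6.0)
  g <- gl(2, 3)
  f <- log(y) ~ g
  list(direct = names(model.frame(f)),          # derived variables
       bare   = names(original_vars_frame(f)))  # original variables
}

out <- demo()
out$direct  # "log(y)" "g"
out$bare    # "y" "g"
```

The resulting data frame could then be passed as the new 'data' together with the new 'formula'.
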

Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Tel./Fax: +40 21 3126618 \
          +40 21 3120210 / int.101
