[R] finding global variables in a function containing formulae

William Dunlap wdunlap at tibco.com
Sun Nov 4 01:18:31 CET 2012


> -----Original Message-----
> From: William Dunlap
> Sent: Saturday, November 03, 2012 11:23 AM
> To: 'Hafen, Ryan P'; Bert Gunter
> Cc: r-help at r-project.org
> Subject: RE: [R] finding global variables in a function containing formulae
> 
> findGlobals must be explicitly ignoring calls to the ~ function.
> You could poke through the source code of codetools and find
> where this is happening.

I looked through some old notes and found you could
disable the special handler for "~" by removing it from
the environment codetools:::collectUsageHandlers:
  > findGlobals(function(y)lm(y~x)) # doesn't note 'x' as a global reference
  [1] "~"  "lm"
  > tildeHandler <- codetools:::collectUsageHandlers[["~"]]
  > remove("~", envir=codetools:::collectUsageHandlers)
  > findGlobals(function(y)lm(y~x)) # notes 'x'
  [1] "~"  "lm" "x"
  > # reinstall "~" handler to get original behavior
  > # or detach("package:codetools", unload=TRUE) and reattach
  > assign("~", tildeHandler, envir=codetools:::collectUsageHandlers)
  > findGlobals(function(y)lm(y~x)) # does not note 'x'
  [1] "~"  "lm"
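
A minimal sketch of wrapping those steps in a helper, so that the handler
is put back even if findGlobals() stops with an error. The name
findGlobalsWithFormulae is made up for illustration, and the code assumes
that collectUsageHandlers remains an unexported environment in codetools
with a "~" entry, which may change between versions:

```r
library(codetools)

# Hypothetical helper (not part of codetools): run findGlobals() with the
# special "~" handler temporarily removed, then restore it on exit.
findGlobalsWithFormulae <- function(f, merge = TRUE) {
  # Assumption: this internal environment exists and holds the handlers.
  handlers <- codetools:::collectUsageHandlers
  if (exists("~", envir = handlers, inherits = FALSE)) {
    tildeHandler <- get("~", envir = handlers, inherits = FALSE)
    remove("~", envir = handlers)
    # Reinstall the handler however the function exits.
    on.exit(assign("~", tildeHandler, envir = handlers), add = TRUE)
  }
  findGlobals(f, merge = merge)
}
```

With this, findGlobalsWithFormulae(function(y) lm(y ~ x)) should include
'x' in its result, as in the transcript above.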

You still have the false alarm problem: a formula variable that is really
a column of a data argument gets reported as a global.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> 
> Or, if you have the source code for the package you are investigating,
> use sed to change all "~" to "%TILDE%" and then use findGlobals on
> the resulting source code.  The messages will be a bit garbled but
> should give you a start.  E.g., compare the following two, in which y
> is defined in the function but x is not:
>   > findGlobals(function(y)lm(y~x))
>   [1] "~"  "lm"
>   > findGlobals(function(y)lm(y %TILDE% x))
>   [1] "lm"      "%TILDE%" "x"
> 
> You will get false alarms, since in a call like lm(y~x+z, data=dat) findGlobals
> cannot know if dat includes columns called 'x', 'y', and 'z' and the above
> approach errs on the side of reporting the potential problem.
> 
> You could instead use code in codetools to walk the parsed code (the
> language objects) rather than the source text and replace all calls to "~"
> with calls to "%TILDE%", but that is more work than using sed on the
> source code.
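
A rough sketch of that second route, walking the parsed body of the
function rather than editing the source text. The names replaceTilde and
findGlobalsTilde are hypothetical, and formulae that appear in default
argument values are not handled:

```r
library(codetools)

# Hypothetical helper: recursively rename calls to `~` as calls to `%TILDE%`.
replaceTilde <- function(e) {
  if (is.call(e)) {
    if (identical(e[[1]], as.name("~"))) {
      e[[1]] <- as.name("%TILDE%")
    }
    for (i in seq_along(e)) {
      # Recurse only into sub-calls; symbols, constants, NULL and empty
      # arguments (as in x[1, ]) are left untouched.
      if (is.call(e[[i]])) {
        e[[i]] <- replaceTilde(e[[i]])
      }
    }
  }
  e
}

# Hypothetical wrapper: rewrite a copy of the function, then analyze it.
findGlobalsTilde <- function(f) {
  body(f) <- replaceTilde(body(f))
  findGlobals(f)
}
```

The result will list "%TILDE%" among the functions, and it is subject to
the same false alarms as the sed approach.
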
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> > Of Hafen, Ryan P
> > Sent: Friday, November 02, 2012 4:28 PM
> > To: Bert Gunter
> > Cc: r-help at r-project.org
> > Subject: Re: [R] finding global variables in a function containing formulae
> >
> > Thanks.  That works if I have the formula expression handy.  But suppose
> > I want a function, findGlobalVars() that takes a function as an argument
> > and finds globals in it, where I have absolutely no idea what is in the
> > supplied function:
> >
> > findGlobalVars <- function(f) {
> >    require(codetools)
> >    findGlobals(f, merge=FALSE)$variables
> > }
> >
> >
> > findGlobalVars(plotFn1)
> >
> > I would like findGlobalVars() to be able to find variables in formulae
> > that might be present in f.
> >
> >
> >
> >
> > On 11/1/12 1:19 PM, "Bert Gunter" <gunter.berton at gene.com> wrote:
> >
> > >Does
> > >
> > >?all.vars
> > >##as in
> > >> all.vars(y~x)
> > >[1] "y" "x"
> > >
> > >help?
> > >
> > >-- Bert
> > >
> > >On Thu, Nov 1, 2012 at 11:04 AM, Hafen, Ryan P <Ryan.Hafen at pnnl.gov>
> > >wrote:
> > >> I need to find all global variables being used in a function and
> > >> findGlobals() in the codetools package works quite nicely.  However, I
> > >> am not able to find variables that are used in formulae.  Simply
> > >> avoiding formulae in functions is not an option because I do not have
> > >> control over what functions this will be applied to.
> > >>
> > >> Here is an example to illustrate:
> > >>
> > >> library(codetools)
> > >>
> > >> xGlobal <- rnorm(10)
> > >> yGlobal <- rnorm(10)
> > >>
> > >> plotFn1 <- function() {
> > >>    plot(yGlobal ~ xGlobal)
> > >> }
> > >>
> > >> plotFn2 <- function() {
> > >>    y <- yGlobal
> > >>    x <- xGlobal
> > >>    plot(y ~ x)
> > >> }
> > >>
> > >> plotFn3 <- function() {
> > >>    plot(xGlobal, yGlobal)
> > >> }
> > >>
> > >> findGlobals(plotFn1, merge=FALSE)$variables
> > >> # character(0)
> > >> findGlobals(plotFn2, merge=FALSE)$variables
> > >> # [1] "xGlobal" "yGlobal"
> > >> findGlobals(plotFn3, merge=FALSE)$variables
> > >> # [1] "xGlobal" "yGlobal"
> > >>
> > >> I would like to find that plotFn1 also uses globals xGlobal and
> > >> yGlobal.  Any suggestions on how I might do this?
> > >>
> > >> ______________________________________________
> > >> R-help at r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >>http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >
> > >
> > >
> > >--
> > >
> > >Bert Gunter
> > >Genentech Nonclinical Biostatistics
> > >
> > >Internal Contact Info:
> > >Phone: 467-7374
> > >Website:
> > >http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> >



