[Rd] environment question

Mon Dec 27 12:24:42 CET 2010

On 10-12-26 4:30 PM, Paul Johnson wrote:
 > Hello, everybody.
 >
 > I'm putting together some lecture notes and course exercises on R
 > programming.  My plan is to pick some R packages, ask students to read
 > through code and see why things work, maybe make some changes.  As I
 > look for examples, I'm running up against the problem that packages
 > use coding idioms that are unfamiliar to me.
 >
 > A difficult thing for me is explaining scope of variables in R
 > functions.  When should we pass an object to a function, when should
 > we let the R system search about for an object?  I've been puzzling
 > through ?environment for quite a while.

Take a look at the Language Definition, not just the ?environment page.

 >
 > Here's an example from one of the packages that I like, called "ltm".
 > In the function "ltm.fit" the work of calculating estimates is sent to
 > different functions like "EM" and "loglikltm" and "scoreltm".  Before
 > that, this is used:
 >
 > environment(EM)<- environment(loglikltm)<- environment(scoreltm)<-
 > environment()
 >
 > ##and then EM is called
 > res.EM<- EM(betas, constraint, control$iter.em, control$verbose)
 >
 > I want to make sure I understand this. The environment line gets the
 > current environment and then assigns it for those 3 functions, right?
 > All variables and functions that can be accessed from the current
 > position in the code become available to function EM, loglikltm,
 > scoreltm.

That's one way to think of it, but it is slightly more accurate to say
that three new functions are created as local copies, whose associated
environments are set to the current environment.  The original
definitions are left unchanged.
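
A minimal sketch of that point, using hypothetical names (helper, fit,
scale_by) rather than anything from ltm: the replacement form
environment(f) <- ... inside a function makes a local copy of f, and only
that copy gets the new environment.

```r
# Hypothetical helper; 'scale_by' is a free variable, not an argument.
helper <- function(x) x * scale_by

fit <- function() {
  scale_by <- 10                        # local to this call of fit()
  environment(helper) <- environment()  # makes a local copy of helper
  helper(2)                             # the copy finds scale_by in fit()'s frame
}

env_before <- environment(helper)       # environment of the original helper
result <- fit()
unchanged <- identical(environment(helper), env_before)

print(result)     # 20
print(unchanged)  # TRUE: the original helper was not modified
```

Calling the original helper(2) outside fit() would still fail in this
sketch, since scale_by exists only inside fit().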

 >
 > So, which options should be explicitly inserted into a function call,
 > which should be left in the environment for R to find when it needs
 > them?

That's a matter of style.  I would say that it is usually better style 
not to mess around with a function's environment.

 >
 > 1. I *think* that when EM is called, the variables "betas",
 > "constraint", and "control" are already in the environment.

They need not be there when the environments are set, as long as they
are in the environment by the time EM, loglikltm, and scoreltm are called.
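
As a rough illustration with hypothetical names (show_total, runner,
total): the free variable is looked up when the function runs, so it can
be defined after the environment has been assigned.

```r
# Hypothetical function; 'total' is a free variable.
show_total <- function() total

runner <- function() {
  environment(show_total) <- environment()  # 'total' does not exist yet
  total <- 42                               # defined later, before the call
  show_total()                              # lookup happens here
}

late_result <- runner()
print(late_result)  # 42
```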

 >
 > The EM function is declared like this, using the same words "betas" and
 > "constraint"
 >
 > EM<-
 > function (betas, constraint, iter, verbose = FALSE) {
 >
 > It seems to me that if I wrote the function call like this (leave out
 > "betas" and "constraint")
 >
 > res.EM<- EM(control$iter.em, control$verbose)
 >
 > R will run EM and go find "betas" and "constraint" in the environment,
 > there was no need to name them as arguments.

Including them as arguments means that new local copies will be created 
in the evaluation frame.
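
A small sketch of the difference, assuming hypothetical functions
with_arg and no_arg defined inside a hypothetical outer_fn: the argument
version works on its own local copy, while the other version reads the
variable from the enclosing environment.

```r
outer_fn <- function() {
  betas <- c(1, 2, 3)

  # 'betas' is an argument: changes affect only the local copy.
  with_arg <- function(betas) {
    betas[1] <- 99
    betas
  }

  # 'betas' is a free variable: it is found in outer_fn()'s frame.
  no_arg <- function() betas

  list(arg = with_arg(betas), free = no_arg(), original = betas)
}

res <- outer_fn()
print(res$arg)       # 99 2 3
print(res$free)      # 1 2 3
print(res$original)  # 1 2 3
```

Note that with EM declared as quoted above, a call such as
EM(control$iter.em, control$verbose) would match those two values
positionally to betas and constraint, so the formal arguments would have
to be removed from the definition as well.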

 >
 >
 > 2 Is a function like EM allowed to alter objects that it finds through
 > the environment, ones that are not passed as arguments? I understand
 > that a function cannot alter an object that is passed explicitly, but
 > what about the ones it grabs from the environment?

Yes, it's allowed, but ordinary assignment won't do it.  Read
about the <<- operator for modifying things that are not local.  In summary:

  beta <- 1

creates a new local variable, or modifies an existing local one, while

  beta <<- 1

goes looking for beta in the enclosing environments, and modifies the
first one it finds.  If it fails to find one, it creates one in the
global environment.
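
A short sketch contrasting the two operators, using a hypothetical
make_counter function (not from ltm):

```r
make_counter <- function() {
  count <- 0

  # '<-' creates a new local 'count' inside bump_local; the outer one is untouched.
  bump_local <- function() {
    count <- count + 1
    count
  }

  # '<<-' modifies the 'count' found in the enclosing environment.
  bump_outer <- function() {
    count <<- count + 1
    count
  }

  list(bump_local = bump_local,
       bump_outer = bump_outer,
       get = function() count)
}

counter <- make_counter()
local_value <- counter$bump_local()  # 1, but only inside that call
after_local <- counter$get()         # 0: enclosing 'count' unchanged
outer_value <- counter$bump_outer()  # 1
after_outer <- counter$get()         # 1: enclosing 'count' was modified
```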

Duncan Murdoch

 > If you have ideas about packages that might be handy teaching
 > examples, please let me know.
 >
 > pj