[R] Concepts question: environment, frame, search path

Duncan Murdoch murdoch at stats.uwo.ca
Tue May 1 13:16:26 CEST 2007

On 01/05/2007 12:29 AM, Graham Wideman wrote:
> Folks:
> I'd appreciate if someone could straighten me out on a few concepts which 
> are described a bit ambiguously in the docs.
> 1.  data.frame:
> ----------------
> Refan p84: 'A data frame is a list of variables of the same length with 
> unique row names, given class "data.frame".'
> I probably don't need to point out how opaque that is!

Which manual are you looking at?  The "reference index" (refman.pdf)? It 
doesn't usually include statements like that; they are usually found in 
the Introduction to R (R-intro.pdf) or the R Language Definition 
(R-lang.pdf).  But since the refman is just a collection of man pages, 
it might be in there somewhere.  And since the manuals do get updated, 
that statement may not be present in the current release.  (I did a 
quick search of the source, and couldn't spot it, but my search might 
have failed because of line breaks, strange formatting, or looking in 
the wrong place.)

By the way, it's generally best to cite the section name where you found 
a quote, because the pagination varies from system to system.  Even 
better would be to give a URL to the online HTML version at 

For future reference, if you are suggesting a change, it's best to cite 
the line number in the source at 
https://svn.r-project.org/R/trunk/doc/manual in the *.texi files or 
https://svn.r-project.org/R/trunk/src/library/*/man/*.Rd for man pages, 
and send such suggestions to the R-devel list.

> Anyhow, key question: Some places in the docs seem pretty firm that a 
> data.frame is basically a 2-D array with:
> a) named rows and
> b) columns whose items within a column must be of uniform data type.
> Elsewhere, it seems like a data.frame can be a collection of arbitrary 
> variables.

The former interpretation is correct.  Since the variables all have the 
same length, things like df[i, j] make sense:  they choose the i'th 
entry from the j'th variable (according to the "refan" definition), or 
the i'th row, j'th column (according to the 2-D array interpretation).
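
To make the two readings concrete, here is a minimal sketch. The data
frame and its column names (df, id, name) are made up for illustration
and do not come from the manuals.

```r
# A small data frame: two variables (columns) of the same length.
# The names df, id and name are illustrative only.
df <- data.frame(id = 1:3, name = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# List view: a data frame is a list of equal-length variables.
is.list(df)
sapply(df, length)

# Both views select the same element.
i <- 2
j <- 2
df[i, j]        # 2-D array view: row i, column j
df[[j]][i]      # list view: i'th entry of the j'th variable
identical(df[i, j], df[[j]][i])
```

Each column holds a single type, but different columns may hold
different types, which is what separates a data frame from a matrix.
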
> 2. environment
> ---------------
> Refman p122:  "Environments consist of a frame, or collection of named 
> objects, and a pointer to an enclosing environment."
> Is the "or" here explaining parenthetically that a frame is a collection of 
> named objects, or is separating two alternative structures for an 
> environment?

The former: the frame is the collection of named objects.

> If the former, does this imply that a frame can contain arbitrary variables?

Yes, but a frame isn't an R object, it's a concept that appears in 
descriptions, e.g. part of an environment, or the local variables 
created during function evaluation, etc.
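
A rough sketch of the two parts of an environment, the frame and the
enclosure, follows. The object names (e, x, f, label, g, local_var) are
invented for the example.

```r
# Create an environment whose enclosure is the global environment.
e <- new.env(parent = globalenv())

# The frame part: arbitrary named objects of any type.
assign("x", 1:10, envir = e)
assign("f", function(z) z + 1, envir = e)
assign("label", "anything", envir = e)
ls(e)                                   # names held in e's frame

# The enclosure part: the environment that is searched next.
identical(parent.env(e), globalenv())

# Evaluating a function creates a fresh frame for its local variables.
g <- function() {
  local_var <- 42
  ls(environment())                     # names in this call's frame
}
g()
```

Note that e itself is an ordinary R object, while its frame is only
reachable through functions such as ls(), get() and assign().
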
> And "pointer"? Is that a type of thing in R?

No, there are no pointers in R.  There are a couple of tricks to fake 
them (e.g. environment objects aren't copied when assigned, you just get 
a new reference to the same environment; this allows you to construct 
something like a pointer by wrapping an object in an environment), but I 
don't recommend using these routinely.
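
For illustration only, the difference between ordinary objects and
environments on assignment can be sketched like this. The names v1, v2,
e1, e2 and bump are hypothetical.

```r
# Ordinary objects behave as if they were copied on assignment.
v1 <- c(1, 2, 3)
v2 <- v1
v2[1] <- 99
v1[1]                  # unchanged by the assignment to v2

# Environments are not copied: e2 is a second reference to e1.
e1 <- new.env()
e1$value <- 1
e2 <- e1
e2$value <- 99
e1$value               # changed through e2

# So a function can modify the caller's object through the reference.
bump <- function(ref) {
  ref$value <- ref$value + 1
  invisible(NULL)
}
bump(e1)
e1$value
```

This is the pointer-like trick mentioned above; it sidesteps R's usual
copy semantics, which is why it is best kept for special cases.
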

> 3.  R search path; attach()
> ----------------------------
> The R search path appears to hold the list of "collections of data" (my 
> term) that can be accessed by a user's commands. Refman p27 tells that 
> search path can hold items that are data.frame, list, environment or R data 
> file (on disk).  Yet R-intro p28 describes attach() as taking a "directory 
> name" argument.  What is the concept "directory" in this context?

I haven't read the preceding pages carefully, but that looks like an 
error.  attach() takes a data frame, list, environment or saved R data 
file, as the refman says.  Packages are attached with library(), and 
what gets attached is an environment holding the exports from the 
package.  Packages are stored in directories in the file system, so 
maybe that's what the author of that line had in mind.
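
As a small sketch of how the search path changes, the following
attaches a data frame, which the Refman passage quoted above lists as
one of the things the search path can hold. The data frame scores and
its columns are made up for the example.

```r
# An illustrative data frame; the names are not from the manuals.
scores <- data.frame(height = c(150, 160), weight = c(50, 60))

search()                    # the current search path
attach(scores)              # goes to position 2 by default
search()[2]                 # the attached entry is named "scores"
mean(height)                # height is now found via the search path
detach("scores")            # remove it again
"scores" %in% search()

# Attached packages show up as "package:<name>" entries.
grep("^package:", search(), value = TRUE)
```

After detach() the columns are no longer visible by bare name, although
scores$height still works as usual.
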

Duncan Murdoch
