[Rd] R-intro suggestions part II (PR#1011)

Tue, 3 Jul 2001 22:55:22 +0200 (MET DST)

Brian Ripley said that, if I sent these here, they might find their way
to the R-intro team.  So, here is another installment:

Concerning this section:

Data permanency and removing objects.

This is the first place R-intro discusses objects.  I was thinking a
better discussion of objects and syntax is needed, perhaps here, or in
the later section "Objects, their modes and attributes."  I suggest
this:

The fact that R takes an object-oriented approach to programming for
statistical analysis has pervasive implications.  If one has studied a
programming language like Java or C++, the notions of object and
method are not new.  For readers who have not studied languages like
that, perhaps we can offer a brief explanation.  An object is a "self
contained data holder" that can follow instructions (do calculations,
provide values).  One intention of this design is to keep things
separate if they belong that way, to reduce the risk that one
calculation accidentally influences the result of another.
Object-orientation also has important implications for the way the
different parts of a program fit together.  Because an object is
supposed to be able to carry out a certain set of actions (which are
called methods, but in R one often says "functions" or "procedures"),
both the user and the programmer understand what the object can
be expected to do.  There are numerous implications of object
orientation that come into play "under the hood" of the R statistical
engine, but it is not important to delve into them from a user's point
of view.

There is one wrinkle about the syntax of R that we hasten to
emphasize.  In object-oriented computer languages, syntax typically
puts the object first, followed by the instruction, followed by any
needed parameters.  In Java, to tell the object "regression" to calculate
estimates for vectors y and x, one would write something like:

   regression.estimate(y,x);

Assuming that object's class has an "estimate" method which knows how
to handle the input, we would be in business.

In R, the syntax is totally different.  Instead of thinking of the
object itself as the primary thing, the syntax in R is designed to use
the method name--the instruction--as the leading concept, and then the
object that is being acted upon is given as a parameter.
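
To make the contrast concrete, here is a minimal sketch of the
method-first style using R's S3 dispatch.  The generic "estimate", the
class "regression", and the data are hypothetical names invented for
this illustration; they are not part of R itself.

```r
# Hypothetical generic: the function name comes first, and R picks the
# method according to the class of the first argument.
estimate <- function(object, ...) UseMethod("estimate")

# Hypothetical method for objects of class "regression".
estimate.regression <- function(object, ...) {
  d <- data.frame(y = object$y, x = object$x)
  coef(lm(y ~ x, data = d))
}

# An illustrative object: a list of made-up data tagged with a class.
regression <- structure(
  list(y = c(2.1, 3.9, 6.2, 7.8, 10.1), x = 1:5),
  class = "regression"
)

# The object is passed as a parameter, not written in front of the call.
est <- estimate(regression)
```

The call estimate(regression) plays the role that
regression.estimate(...) would play in Java: the object is the same,
but it sits inside the parentheses.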

When a model gives a result in R, the result is almost invariably an
object, which can then be "poked and prodded" with other methods.
If a linear regression is calculated, for example:

   resultObject <- lm(y ~ x)

the return value is an object.  But, unlike in other object-oriented
languages, in R one would write summary(resultObject), or
coefficients(resultObject), rather than the Java style
resultObject.summary() or some such thing, in order to investigate the
result.
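
A short session along these lines might look as follows.  The values
of x and y are made up purely for illustration; only lm(), class(),
summary(), and coefficients() come from R.

```r
# Made-up data, for illustration only.
x <- 1:5
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Fit the model; the return value is an object of class "lm".
resultObject <- lm(y ~ x)
class(resultObject)

# Investigate the result by passing the object to other functions.
summary(resultObject)
coefficients(resultObject)
```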

Paul E. Johnson
Political Science
University of Kansas
