[R] Re: re| R-list| internal-fields

Mon Nov 25 20:09:22 CET 2002

I was asked to expand on the point below

  >You shouldn't use fm1$residuals --- the S language doesn't prevent you
  >from accessing internal fields of objects directly, but it's still a bad
  >idea, especially if you don't know what they mean.

So here's a brief lecture:

One of the main purposes of objects in programming is to hide
implementation details from users.  This has two benefits: you can change
the implementation without breaking other people's code, and you can have
the same interface for a wide variety of objects with very different
internal implementations. In more rigorously object-oriented languages
there is no access to the internals of an object --- you can only use the
supplied accessor functions.
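
As a rough illustration of the idea (the two classes and their constructors
below are made up for this example, they are not from any package), here are
two objects that store their information quite differently but answer the
same residuals() call:

```r
# Hypothetical class 1: stores the residuals directly in a field called 'res'
new_stored_fit <- function(y, fitted) {
  structure(list(res = y - fitted), class = "stored_fit")
}

# Hypothetical class 2: stores only the data and the fitted values
new_lazy_fit <- function(y, fitted) {
  structure(list(y = y, fitted = fitted), class = "lazy_fit")
}

# S3 methods for the standard residuals() generic: one per class
residuals.stored_fit <- function(object, ...) object$res
residuals.lazy_fit   <- function(object, ...) object$y - object$fitted

y <- c(1, 2, 4)
f <- c(1.5, 2, 3)
a <- new_stored_fit(y, f)
b <- new_lazy_fit(y, f)

# Same interface, same answer, different internals
residuals(a)
residuals(b)

# Neither object has a component called 'residuals'
is.null(a$residuals)
is.null(b$residuals)
```

Code that calls residuals() keeps working if the author of either class
later changes what is stored; code that reaches into the fields does not.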

In S, things are less dictatorial.  You *can* use model$residuals to
access the residuals component of a glm directly, or you can use the accessor
function resid(model).  You still *should* use the accessor function where
possible.

One reason is that the internal structure is often not documented and
is subject to change. As it happens, the $residuals component of a glm is
documented and probably reasonably stable, but the internal workings of, e.g.,
a coxph object in the survival package are much less stable from version
to version and are not documented.

In addition, the accessor functions are similar across different models,
so it's easier to learn R if you use them.  The command resid(model)
returns some sensible type of residual for all regression models, and
where appropriate it has a type= argument to specify other types of
residuals.  It's a lot harder to keep track of which component of the
object contains the residuals -- and there's no guarantee that there will
be one.
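
A small sketch of this, using datasets that ship with R (the choice of models
is mine, purely for illustration). The same resid() call works for all three
fits, while the nls object, at least in current versions of R, does not
store a $residuals component at all:

```r
# Three different kinds of regression model
fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

DNase1  <- subset(DNase, Run == 1)
fit_nls <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), data = DNase1)

# The accessor works the same way for each of them
head(resid(fit_lm))
head(resid(fit_glm))
head(resid(fit_nls))

# Direct field access does not: there is no such component in the nls fit
is.null(fit_nls$residuals)
```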

The glm object provides a nice example.  The $residuals component contains
the working residuals, which are probably not the ones you want.  You are
more likely to want the deviance or Pearson residuals, which are only
available through the resid() function.
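
To see the difference, here is a short check on a Poisson regression (the
warpbreaks data and the model formula are just a convenient example, not
anything special):

```r
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# The stored component is the working residuals
all.equal(fit$residuals, resid(fit, type = "working"))

# The accessor defaults to deviance residuals, which are different numbers
isTRUE(all.equal(fit$residuals, resid(fit)))

# Other types are requested through the type argument
head(resid(fit, type = "deviance"))
head(resid(fit, type = "pearson"))
```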

Another good example is the vcov() function that returns the covariance
matrix of the coefficient estimates.  This is usually not a component of
the model object, although for some model classes it is.  It's often a
component of the value returned by summary(), but even then it has
different names for different models.
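
For instance (again with example models of my own choosing), the same
vcov() call works for lm and glm fits, while the corresponding piece of the
summary() value has to be assembled differently in each case:

```r
fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# One accessor for both models
vcov(fit_lm)
vcov(fit_glm)

# Digging it out of summary() depends on the model class:
# for lm it is the unscaled matrix times the residual variance ...
s_lm <- summary(fit_lm)
all.equal(vcov(fit_lm), s_lm$cov.unscaled * s_lm$sigma^2)

# ... for glm there is a separate, already scaled component
s_glm <- summary(fit_glm)
all.equal(vcov(fit_glm), s_glm$cov.scaled)
```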

	-thomas
