[Rd] Objects in R

Nathan Whitehouse nlwhitehouse at yahoo.com
Thu Apr 21 18:37:30 CEST 2005


Hi,
  A few comments from a fairly experienced R user who
worked for several years on an R-based bioinformatics
analysis framework.

  I don't want to misrepresent anyone's views, but...

  There are real disadvantages to the
"objects-as-C-structs" model, with functions/methods
that "mutate" based on argument type, i.e. S4.

  (1) Novices simply don't understand it.  Students are
trained in "standard" object-oriented technique, and
this wonkish offshoot (puritanical functional
programming) just increases the information costs of
using R and thus decreases the demand.

  (2) Large frameworks benefit from
serializable/storable objects which contain both
functionality and modifiable values.  S4 stores
"class" information and R.oo does not upon
"save()"ing, but there are still real hindrances to
"trading" objects, which is -extraordinarily-
important in creating industrial-variety R-based
analysis.
  The classic example, to my mind, is the difficulty
of implementing a "visitor" pattern in S4.

  (3) The absence of references means that, for large
datasets and long "analysis flows," there is either
(a) a hideous amount of memory used storing each
predecessor analysis or (b) awkward "references" of the
kind I've seen used, like storing the name of the
referenced object in a data slot.
  I find the use of environments in R.oo as opposed to
the glorified LISTSXP of S4 to be a satisfying way
around this.
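
  A minimal sketch of the contrast in point (3), using only base R
and the methods package rather than R.oo itself.  The class and
function names (Analysis, scale_s4, new_analysis_ref, scale_ref) are
made up for illustration and are not taken from any package.

```r
library(methods)

# S4 object: copy semantics. "Modifying" a slot inside a function
# changes only the function's local copy.
# (Analysis is a hypothetical class for this sketch.)
setClass("Analysis", representation(data = "numeric"))

scale_s4 <- function(obj, k) {
  obj@data <- obj@data * k   # affects the local copy only
  obj                        # caller must capture the returned object
}

a <- new("Analysis", data = c(1, 2, 3))
b <- scale_s4(a, 10)         # a is left untouched; b holds the result

# Environment: reference semantics. Every variable that holds the
# environment sees the same underlying data.
new_analysis_ref <- function(data) {
  self <- new.env(parent = emptyenv())
  self$data <- data
  self
}

scale_ref <- function(ref, k) {
  ref$data <- ref$data * k   # modifies the shared environment in place
  invisible(ref)
}

r <- new_analysis_ref(c(1, 2, 3))
scale_ref(r, 10)             # r itself now holds the scaled data
```

  With the S4 version, each step of a long analysis flow yields a
new object that the caller has to keep; with the environment
version, the steps can update one shared object instead.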

  S4 is a nice step forward.  But R should be open to
further evolution.  The design choices for S4 and the
reasons behind abandoning OOP have never been
adequately justified, to my knowledge.  Instead, most
inquiries have been met with a Sphinx-like silence from
the core community.

  But the hindrances faced by our friend Ali are
common, and even in packages maintained by experienced
R developers, S4 is implemented, shall we say, curiously
as per the specs.
  Clearly OOP and R.oo are not the final answer.  But
some serious discussion is in order about why packages
like R.oo, which "layer" onto the standard functional R,
are considered inappropriate.

  It would be great to see R emerge from its niche
audience.  I believe that would aid statisticians and
programmers.  However, a little bit more transparency
and something beyond a categorical "we just don't like
that way of doing things" would go a long way towards
growing the base community of R.
 
  Cheers,
  Nathan Whitehouse
  Formerly of Baylor College of Medicine.

Ali, maybe we R-core members are not decent enough.
But we strongly believe that we don't want to advocate yet
another object system additionally to the S3 and S4 one,
and several of us have given talks and classes, even written
books on how to do "decent" object oriented programming
`just' with the S3 and/or S4 object system.

No need of additional "oo" in our eyes.
Your main problem is that you assume what "oo" means {which may
well be true} but *additionally* you also assume that OO has to
be done in the same way you know it from Python, C++, or Java..

Since you are new, please try to learn the S4 way,
where methods belong to (generic) functions more than
to classes in some way, particularly if you compare with other
OO systems where methods belong entirely to classes.
This is NOT true for R (and S-plus) and we don't want this to
change {and yes, we do know about C++, Python, Java,... and
their way to do OO}.

Please also read in more details the good advice given by Tony
Plate and Sean Davis.

Martin Maechler,
ETH Zurich
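
The point made in the quoted message, that in S4 methods belong to
generic functions rather than to classes, can be sketched roughly as
follows.  The classes (Circle, Square) and generics (area, overlaps)
are hypothetical names chosen for illustration only.

```r
library(methods)

# Classes hold data only; no methods are defined "inside" them.
setClass("Circle", representation(r = "numeric"))
setClass("Square", representation(side = "numeric"))

# The generic is an ordinary function, and methods are attached to it.
setGeneric("area", function(shape) standardGeneric("area"))
setMethod("area", "Circle", function(shape) pi * shape@r^2)
setMethod("area", "Square", function(shape) shape@side^2)

# Because methods belong to the generic, dispatch can depend on the
# classes of several arguments at once, not on one "owning" object.
setGeneric("overlaps", function(a, b) standardGeneric("overlaps"))
setMethod("overlaps", signature("Circle", "Square"),
          function(a, b) "circle-vs-square")
setMethod("overlaps", signature("Square", "Circle"),
          function(a, b) "square-vs-circle")

sq <- new("Square", side = 2)
ci <- new("Circle", r = 1)
area(sq)           # call style is generic(object), not object$method()
overlaps(ci, sq)   # method selected from the classes of both arguments
```

In a class-centred system the second example would typically need a
visitor or double-dispatch arrangement; here it is a method on a
two-argument generic.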



Nathan Whitehouse
nlwhitehouse at yahoo.com


