[R] S4 Classes, nested objects and references

Martin Morgan mtmorgan at fhcrc.org
Thu Dec 3 21:33:30 CET 2009


Joris Meys wrote:
> Hi all,
> 
> I'm currently programming my first complete package in S4. (thanks to
> Christophe Genolini for the nice introduction he wrote). I have an
> object "Data" with a number of slots. One of those slots is "meteo".
> Now "Meteo" is on itself a class with again a number of slots (like
> rainfall, temperature,..., you get the picture).
> 
> I defined the slot "meteo" currently as a character slot, and the
> values refer to the names of the Meteo-objects related to that
> Data-object. The cleaner way would be to define the slot "meteo" as a
> slot of class "Meteo", but I'm not sure how that works internally.
> 
> Thing is, I have multiple Data objects that refer to the same Meteo
> object. I am a bit afraid that when I define the slot meteo as a slot
> of the corresponding class, each Data object will contain a complete
> copy of the Meteo object it relates to. This would mean that in the
> memory I will end up with multiple copies of exactly the same data.
> Although it is cleaner, it is definitely not more efficient.
> 
> My question : Am I wrong in my assumption that I will have multiple
> copies in the memories? If yes, is it possible to use references in a
> more formal way than I do now? Or is there an obvious solution I am
> missing here?

Hi Joris -- you're second-guessing R's memory management; it could be
that the data are physically replicated, but that may not necessarily be
so. The first thing to do is the obvious: define the slot to contain an
object of class Meteo.
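
As a rough sketch of this first approach, the slot can be declared with
class "Meteo" and the same Meteo object handed to several Data objects.
The slot names rainfall and temperature come from the question; the id
slot and the sample values are made up for illustration.

```r
library(methods)

# Meteo holds the weather series; slot names follow the question.
setClass("Meteo",
         representation(rainfall = "numeric", temperature = "numeric"))

# Data keeps a Meteo object directly in its "meteo" slot.
# The "id" slot is hypothetical, only there to tell the objects apart.
setClass("Data", representation(id = "character", meteo = "Meteo"))

met <- new("Meteo",
           rainfall = c(0, 2.5, 1.1),
           temperature = c(12.3, 14.8, 13.0))
d1 <- new("Data", id = "site1", meteo = met)
d2 <- new("Data", id = "site2", meteo = met)

# Both objects see the same values; whether R physically duplicates them
# is left to its memory management.
same <- identical(d1@meteo, d2@meteo)
```

If R was built with memory profiling, calling tracemem() on the large
vector before constructing the objects is one way to check whether a
copy is actually made.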

If memory management really is an issue, then round two might define
Meteo to contain a slot that is an environment, in which the big data
are stored.

# the class holds only a reference (an environment) to the data
setClass("Meteo", representation=representation(bigData="environment"))

# empty parent, so lookups never fall through to other environments
bigData = new.env(parent=emptyenv())
bigData[["myData"]] = <...>
m = new("Meteo", bigData=bigData)
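
A filled-in version of the snippet above might look like the following.
It is only a sketch: the data frame standing in for the elided data, the
Data class and its id slot are assumptions for illustration, and the
class is named Meteo as in the question.

```r
library(methods)

# Meteo keeps only a reference (an environment) to the large data.
setClass("Meteo", representation(bigData = "environment"))
setClass("Data", representation(id = "character", meteo = "Meteo"))

# Hypothetical stand-in for the real measurements.
bigData <- new.env(parent = emptyenv())
bigData[["myData"]] <- data.frame(rainfall = c(0, 2.5, 1.1),
                                  temperature = c(12.3, 14.8, 13.0))

m <- new("Meteo", bigData = bigData)
d1 <- new("Data", id = "site1", meteo = m)
d2 <- new("Data", id = "site2", meteo = m)

# Environments are not copied on assignment, so both Data objects
# point at the very same environment.
shared <- identical(d1@meteo@bigData, d2@meteo@bigData)
```

Here identical() compares the two environments by reference, so a TRUE
result means the data exist once, however many Data objects refer to
them.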

This really changes the semantics of objects, so you'll want to protect
your end users from unintended consequences, e.g., after n = m, changing
m@bigData[["myData"]] would also change n. You might use lockEnvironment
(with bindings=TRUE) in an initialize method to make sure that bigData
is really read-only, or provide accessors that copy bigData when the
user wants to make a change. It is also important to realize that
setClass defines a prototype and the prototype contains an environment;
unless you take care, all instances derived from the prototype (e.g., by
calling new() without a bigData argument) will share the same
environment. Probably not what you want. This extra work really reflects
the change in semantics implied by references; it is only indirectly
related to S4.
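
The precautions described above could be sketched roughly as follows.
This is an illustration under assumptions, not a recommended design: the
helper setData() and the sample values are hypothetical, the class is
named Meteo as in the question, and locking is shown as a separate step
rather than inside initialize so that each effect can be seen on its
own.

```r
library(methods)

setClass("Meteo", representation(bigData = "environment"))

# initialize method: a fresh environment per object unless the caller
# supplies one, so instances do not fall back on the prototype's.
setMethod("initialize", "Meteo",
    function(.Object, ..., bigData = new.env(parent = emptyenv())) {
        callNextMethod(.Object, ..., bigData = bigData)
    })

a <- new("Meteo")
b <- new("Meteo")
distinct <- !identical(a@bigData, b@bigData)

# Reference semantics: n is assigned from m, yet both point at one
# environment, so a change made through m is visible through n.
env <- new.env(parent = emptyenv())
env[["myData"]] <- c(1, 2, 3)          # hypothetical data
m <- new("Meteo", bigData = env)
n <- m
m@bigData[["myData"]] <- c(9, 9, 9)
aliased <- identical(n@bigData[["myData"]], c(9, 9, 9))

# Hypothetical accessor-style helper: copy the environment before
# changing it, so the object passed in is left untouched.
setData <- function(object, value) {
    fresh <- list2env(as.list(object@bigData, all.names = TRUE),
                      parent = emptyenv())
    fresh[["myData"]] <- value
    object@bigData <- fresh
    object
}
p <- setData(m, c(0, 0, 0))

# Locking the environment and its bindings makes the shared data
# read-only; a later assignment raises an error.
lockEnvironment(env, bindings = TRUE)
locked <- inherits(try(m@bigData[["myData"]] <- 0, silent = TRUE),
                   "try-error")
```

Copying in the helper trades memory for safety: the copy happens only
when a user actually modifies the data, while plain reads keep sharing
the single environment.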

Martin

> 
> Thank you in advance
> Cheers
> Joris
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
