[R] S4 vs Reference Classes

Joseph Park jpark.us at att.net
Wed Sep 14 21:56:58 CEST 2011


   Gentlemen: Steve, Martin & Doug:

   Thanks for the insightful comments regarding my query.
   I think that Martin and Doug have assessed my position well;
   both offer useful advice and have greatly improved
   my limited understanding of S4. Thanks!

   At this point, I'm well into the app via S4, and so will
   probably continue on. If the app finds wings, then I'll
   convert it to Reference Classes.

   Generally, my problem with S4 in an OO paradigm is that I
   need to add (what I consider) extra code in the main app
   environment to update object slots. As Doug points out:

   "If you try to perform some kind of
   update operation on an S4 object and not cheat in some way (i.e.
   adhere to strict functional programming semantics) you need to create
   a new instance of the object each time you update it."

   which is my issue. Without the reference-based approach, an object
   placed in a slot of another object is a copy.
   An update to the original object then requires 'extra' code
   to update/synchronize the copy.
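
The synchronization issue described here can be reproduced in a few lines. This is only a minimal sketch: the class names "Sensor" and "Station" and the slot "gain" are made up for illustration and do not come from the app under discussion.

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("Sensor",  representation(gain = "numeric"))
setClass("Station", representation(sensor = "Sensor"))

s  <- new("Sensor", gain = 1)
st <- new("Station", sensor = s)   # st@sensor holds a copy of s

s@gain <- 2                        # modifies s only
stale <- st@sensor@gain            # the copy inside st is unchanged

# The 'extra' synchronization step: push the updated object back in.
st@sensor <- s
fresh <- st@sensor@gain
```

After the update, `stale` still holds the old value and `fresh` holds the new one, which is the copy behaviour Doug's remark describes.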

   This is not a complaint! I find R quite amazing and powerful.
   Next time I'll dive into the Reference Class methods, or perhaps
   as suggested, hybridize the current app.

   On 9/14/2011 12:02 PM, Martin Morgan wrote:

     On 09/14/2011 06:01 AM, Joseph Park wrote:

     Thanks Martin.
     What I'm hoping to do is have a class object, with a member method
     that can change values of slots in the object, without having to
     assign values by external assignment to the object. Something like this:
     setClass( "Element",
         representation( x = "numeric", y = "numeric" ),
         prototype = list( x = 0, y = 1 )
     )
     setGeneric( name = "ComputeX",
         def = function( self ) standardGeneric("ComputeX") )
     setMethod( "ComputeX", signature = "Element",
         function ( self ) {
             if ( self @ y > 0 ) {
                 self @ x = pi
             }
         }
     )
     so that a call to the method ComputeX assigns ('internally') a
     value to the slot x of the global object.

     Hi Joseph --
     I understand. In R generally, and in S4 in particular, self@x = pi triggers
     a 'copy-on-change', so self inside the function is now different from self
     outside the function.
     You either need to change your expectations, or use reference classes (and
     change the expectations of your users).
     For completeness, in your function above you would return self, and have
     elt = ComputeX(elt)
     you'd also likely implement some 'accessor' X (or better named) so
     X(elt)
     to get X. So there is no direct call to @ in your code.
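
Put together, the suggestion above might look like the following sketch. It reuses the Element class from the question; the accessor name X is the placeholder used in the reply, and prototype() is used here in place of a plain list, which is an editorial choice rather than something the thread prescribes.

```r
library(methods)

setClass("Element",
         representation(x = "numeric", y = "numeric"),
         prototype = prototype(x = 0, y = 1))

setGeneric("ComputeX", function(self) standardGeneric("ComputeX"))
setMethod("ComputeX", signature = "Element",
          function(self) {
              if (self@y > 0) {
                  self@x <- pi     # changes the local copy only
              }
              self                 # return the modified copy
          })

# Accessor, so calling code never uses @ directly.
setGeneric("X", function(self) standardGeneric("X"))
setMethod("X", "Element", function(self) self@x)

elt <- new("Element")
elt <- ComputeX(elt)   # reassign to keep the update
X(elt)
```

Without the reassignment in `elt <- ComputeX(elt)`, the object in the calling environment would keep its original x.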
     It might help to understand a real use case; if it's just 'that's the way
     other programming languages do it' then there isn't much more to discuss.
     But maybe, like Doug Bates, you have a particular problem with the
     paradigm?

     One can do :
     a = new( 'Element' )
     a @ x = 2
     but I would prefer to have a class method do the work without
     having to explicitly call a @ x = 2. Having to do this means that
     I need code in my main processing app that does things on slots
     that normally I would do in a class method.
     As I understand it, Reference Classes provide this. So I'm
     naturally wondering if I should switch my app from S4 to RC.
     Fundamentally, I don't clearly understand S4 and what the difference
     is between using setReplaceMethod vs setMethod, since it
     seems that in either case one has to 'externally' assign the slot
     value. My limitation, of course.

     at some level they are differences in syntax only, e.g.,
     slt(a) = 2
     versus
     setGeneric("updt", function(x, value, ...) standardGeneric("updt"))
     setMethod("updt", c("A", "numeric"), function(x, value, ...) {
         initialize(x, slt=value)
     })
     and then
     a = updt(a, 3)
     The 'updt' model easily extends to multiple arguments; both represent an
     abstraction between the API seen by the user, and the implementation of
     the class, so there's no reason to store '3' directly.
     Martin
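
As a rough illustration of how the 'updt' model extends to multiple arguments, here is a sketch with a two-slot class. The class name "A2" and the slot "tag" are hypothetical and were added only so that there is a second value to update.

```r
library(methods)

# Hypothetical two-slot class, for illustration only.
setClass("A2", representation(slt = "numeric", tag = "character"))

setGeneric("updt", function(x, slt, tag, ...) standardGeneric("updt"))
setMethod("updt", "A2", function(x, slt, tag, ...) {
    # initialize() on an existing object returns a copy with the
    # named slots replaced, and checks validity.
    initialize(x, slt = slt, tag = tag)
})

a <- new("A2", slt = 1, tag = "first")
a <- updt(a, slt = 3, tag = "second")   # one call updates both slots
```

The caller sees only updt(); how the values are stored inside the object remains an implementation detail of the class.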

     On 9/14/2011 12:17 AM, Martin Morgan wrote:

     On 09/13/2011 10:54 AM, Joseph Park wrote:

     Hi, I'm looking for some guidance on whether to use
     S4 or Reference Classes for an analysis application
     I'm developing.
     I'm a C++/Python developer, and like to 'think' in OOD.
     I started my app with S4, thinking that was the best
     set of OO features in R. However, it appears that one
     needs Reference Classes to allow object methods to assign
     values (other than the .Object in the initialize method)
     to slots of the object.

     With
     setClass("A", representation=representation(slt="numeric"))
     a slot can be updated with @<- and an object updated with a
     replacement method
     setGeneric("slt<-", function(x, ..., value) standardGeneric("slt<-"))
     setReplaceMethod("slt", c("A", "numeric"), function(x, ..., value) {
         x@slt <- value
         x
     })
     so
     > a = new("A", slt=1)
     > slt(a) = 2
     > a
     An object of class "A"
     Slot "slt":
     [1] 2
     The default initialize method also works as a copy constructor with
     validity check, e.g., allowing multiple slot updates
     setReplaceMethod("slt", c("A", "ANY"), function(x, ..., value) {
         initialize(x, slt=as.numeric(value))
     })
     > slt(a) = "1"

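The validity check mentioned above only has a visible effect when the class defines a validity function. The sketch below adds one to class "A"; the rule itself (a single non-negative number) is an assumption made for this example and is not part of the original thread.

```r
library(methods)

# Assumed validity rule, for illustration: slt is one non-negative number.
setClass("A", representation(slt = "numeric"),
         validity = function(object) {
             if (length(object@slt) != 1L || is.na(object@slt) ||
                 object@slt < 0)
                 "slt must be a single non-negative number"
             else
                 TRUE
         })

setGeneric("slt<-", function(x, ..., value) standardGeneric("slt<-"))
setReplaceMethod("slt", c("A", "ANY"), function(x, ..., value) {
    initialize(x, slt = as.numeric(value))   # copy, update, validate
})

a <- new("A", slt = 1)
slt(a) <- "2"                       # coerced to numeric, passes the check

# An invalid value raises an error and leaves a untouched.
res <- tryCatch({ slt(a) <- -1; "accepted" },
                error = function(e) "rejected")
```

A direct `a@slt <- -1` would bypass this check unless validObject(a) is called afterwards, which is one reason to route updates through initialize().
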
     This is typically what I prefer: creating an object, then
     operating on the object (reference) calling object methods
     to access/modify slots.
     So I'm wondering what (dis)advantages there are in
     developing with S4 vs Reference Classes.

     R's copy-on-change semantics leads me to expect that
     b = a
     slt(a) = 2
     leaves b unchanged, which S4 does (necessarily copying and thus with a
     time and memory performance cost). A reference class might be
     appropriate when the entity referred to exists in a single copy, as
     e.g., an on-disk data base, or an external pointer to a C++ class.
     Martin
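
The contrast between the two semantics can be seen side by side in a short sketch. The reference class name "ARef" and its method setSlt are made up for this example.

```r
library(methods)

# S4: value semantics.
setClass("A", representation(slt = "numeric"))
a <- new("A", slt = 1)
b <- a
a@slt <- 2            # b is unaffected

# Reference class counterpart (hypothetical name "ARef").
ARef <- setRefClass("ARef",
    fields  = list(slt = "numeric"),
    methods = list(
        setSlt = function(value) {
            slt <<- value          # modifies the object in place
            invisible(.self)
        }
    ))

ra  <- ARef$new(slt = 1)
rb  <- ra             # rb refers to the same object as ra
rcp <- ra$copy()      # an explicit, independent copy
ra$setSlt(2)          # visible through rb, not through rcp
```

With the reference class, anyone holding rb sees the change made through ra, so code that relies on R's usual copy-on-change behaviour needs an explicit $copy().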

     Things of interest:
     Performance (i.e. memory management)
     Integration compatibility with R packages
     ??? other issues
     Thanks!