[R] Google's R Style Guide

Henrik Bengtsson hb at stat.berkeley.edu
Tue Sep 1 17:35:26 CEST 2009


On Tue, Sep 1, 2009 at 6:29 AM, Duncan Murdoch<murdoch at stats.uwo.ca> wrote:
> On 9/1/2009 8:58 AM, Martin Morgan wrote:
>>
>> Corrado wrote:
>>>
>>> Thanks Duncan, Spencer,
>>>
>>> To clarify, the situation is:
>>>
>>> 1) I have no reasons to choose S3 on S4 or vice versa, or any other
>>> coding convention
>>> 2) Our group has not done any OO developing in R and I would be the
>>> first, so I can set up the standards
>>> 3) I am starting from scratch with a new package, so I do not have any
>>> code I need to re-use.
>>
>> One consideration might be the domain in which you are doing
>> development; Bioconductor for instance makes extensive use of S4 and
>> your efforts at learning to develop S4 would pay off both in your own
>> code and in understanding other packages.
>>
>>> 4) I am an R OO newbie, so whatever I can learn from the beginning what
>>> is better and good for me.
>>>
>>> So the questions would be two:
>>>
>>> 1) What coding style guide should we / I follow? Is the google style
>>> guide good, or is there something better / more prescriptive which makes our
>>> research group life easier? 2) What class type should I use? From what you
>>> two say, I should use S3 because is easier to use .... what are the
>>> disadvantages? Is there an advantages / disadvantages table for S3 and S4
>>> classes?
>>
>> It seems relevant to compare S3 and S4 code for doing S3-style
>> programming, leaving more 'advanced' S4 for another day. In S3 I might
>> define a simple class and method as
>>
>>
>> makeS3Foo <-
>>    function(x=numeric(), y=numeric())
>> {
>>    if (class(x) != "numeric")
>>        stop("'x' must be numeric")
>>    if (class(y) != "numeric")
>>        stop("'y' must be numeric")
>>    l <- list(x=x, y=y)
>>    class(l) <- "S3Foo"
>>    l
>> }
>>
>> doS3 <- function(x, ...) NextMethod("doS3")
>>
>> doS3.default <- function(x, ...) "doS3 default"
>>
>> doS3.S3Foo <- function(x, ...) "doS3 of S3Foo"
>>
>>
>> with an example of use being
>>
>>> doS3(makeS3Foo())
>>
>> [1] "doS3 of S3Foo"
>
>
> That looks as though it was written by an S4 user.  I would write it this
> way (with some differences in behaviour):
>
> S3Foo <- function(x=numeric(), y=numeric()) {
>  structure(list(x=as.numeric(x), y=as.numeric(y)), class="S3Foo")
> }
>
> The rest of my code would be pretty similar to yours, though I think it
> should use UseMethod("doS3") rather than NextMethod("doS3").
>
> Duncan Murdoch
>
>>
>>
>> I use 'makeS3Foo' as a constructor, so that whenever I make an instance
>> of what I'm calling class S3Foo, I have some guarantees about its
>> structure.
>>
>> The S4 implementation might be
>>
>>
>> setClass("S4Foo", representation(x="numeric", y="numeric"))
>>
>> makeS4Foo <-
>>    function(x = numeric(), y=numeric(), ...)
>> {
>>    new("S4Foo", x=x, y=y, ...)
>> }

In S3 using R.methodsS3/R.oo utility functions:

library("R.oo");

setConstructorS3("S3Foo", function(x="numeric", y="numeric") {
  # Validate arguments
  stopifnot(inherits(x, "numeric"));
  stopifnot(inherits(y, "numeric"));

  extend(list(x=x, y=y), "S3Foo");  # similar to structure().
})

Naming guideline: Name the *constructor* function the same as your
class; this will make things consistent and help you and the user.

Note, you don't want to use class(x) != "numeric", because class(x)
may return a vector; always use inherits().

Then, to setup methods for this class, do:

setMethodS3("print", "S3Foo", function(object, ...) {
  ...
})

setMethodS3("plot", "S3Foo", function(object, ...) {
  ...
})

Generic functions are created automagically when missing; no need for
you to specify that explicitly.

Dispatching in S3 is on the first argument only.  FYI, it is extremely
rare that you want to dispatch on other arguments although you hear
that in the context of S4; I only know of one use case with formulas.

One purpose of R.methodsS3/R.oo is to ease any transitions to S4.

/Henrik

>>
>> setGeneric("doS4", function(x, ...) standardGeneric("doS4"),
>>           useAsDefault=function(x, ...) "do default")
>>
>> setMethod("doS4", "S4Foo", function(x, ...) "doS4 of S4Foo")
>>
>> and use with
>>
>>> doS4(makeS4Foo())
>>
>> [1] "doS4 of S4Foo"
>>
>> It seems like the translation between the two is really quite
>> transparent, and equally arcane to someone new to R.
>>
>> Some things I get from S4 are a level of automatic type checking
>>
>>> makeS4Foo(x="bar")
>>
>> Error in validObject(.Object) :
>>  invalid class "S4Foo" object: invalid object for slot "x" in class
>> "S4Foo": got class "character", should be or extend class "numeric"
>>
>> a way of knowing that my 'S4Foo' conforms to expectations -- in S3 I can
>> say
>>
>>  l = list(a=1, b=2)
>>  class(l) <- "S3Foo"
>>
>> and have no way of knowing whether this is 'valid' or not; in S4 I would
>> not use this method of creating a class (I'd use my constructor, or
>> perhaps 'new' if I were being undisciplined, and get type checking as
>> above) but if I did I'd be able to find
>>
>>> class(l) <- "S4Foo"
>>> validObject(l)
>>
>> Error in validObject(l) :
>>  invalid class "S4Foo" object: slots in class definition but not in
>> object: "x", "y"
>>
>> an error when I try and access data not in the class (normally I'd have
>> made a constructor, and not use slot access @ directly)
>>
>>> makeS3Foo()$z
>>
>> NULL
>>>
>>> makeS4Foo()@z
>>
>> Error: no slot of name "z" for this object of class "S4Foo"
>>
>> and reflection on the class
>>
>>> getClass("S4Foo")
>>
>> Class “S4Foo” [in ".GlobalEnv"]
>>
>> Slots:
>>
>> Name:        x       y
>> Class: numeric numeric
>>
>>
>> Some of the tools for documenting S3 classes and methods are more mature
>> than for S4 (e.g., package.skeleton does a better job of making a
>> package containing my existing S3 code objects, but I usually use
>> package.skeleton to start a project, not to move existing objects into a
>> new project). The fact that the class has been declared explicitly means
>>  that I'm expected to document it explicitly (in contrast to, say, the
>> result of lm(), which is documented only as the return value of the
>> function) and this then requires decisions about how to group class and
>> method documentation.
>>
>> Using more complicated S4 features can be, well, more complicated. But
>> then these features are not readily available with S3, so...
>>
>> To touch on a couple of other themes in this thread... Using a '.' in a
>> variable name seems like a very bad idea, given the way S3 dispatch
>> works. I like to think of objects as nouns and functions as verbs, and
>> so prefer to capitalize class names (as though they were proper nouns)
>> and lower-case function names (so they have a more dynamic sense). And
>> there are many S4 style issues that are not addressed by the google doc
>> -- setMethod and setGeneric indentation in particular.
>>
>> A great feature of emacs-ESS that I've recently discovered (thanks
>> Deepayan for pointing this out, and also for command completion) is the
>> C-c C-p command when looking at an Rd file during package development;
>> it very nicely returns the formatted help page to emacs.
>>
>> Martin
>>
>>>
>>> Thanks
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




More information about the R-help mailing list