[R] Google's R Style Guide

Martin Morgan mtmorgan at fhcrc.org
Tue Sep 1 14:58:12 CEST 2009


Corrado wrote:
> Thanks Duncan, Spencer,
> 
> To clarify, the situation is:
> 
> 1) I have no reasons to choose S3 on S4 or vice versa, or any other coding 
> convention
> 2) Our group has not done any OO developing in R and I would be the first, so I 
> can set up the standards
> 3) I am starting from scratch with a new package, so I do not have any code I 
> need to re-use.

One consideration might be the domain in which you are doing
development; Bioconductor for instance makes extensive use of S4 and
your efforts at learning to develop S4 would pay off both in your own
code and in understanding other packages.

> 4) I am an R OO newbie, so whatever I can learn from the beginning what is 
> better and good for me.
> 
> So the questions would be two:
> 
> 1) What coding style guide should we / I follow? Is the google style guide 
> good, or is there something better / more prescriptive which makes our 
> research group life easier? 
> 2) What class type should I use? From what you two say, I should use S3 
> because is easier to use .... what are the disadvantages? Is there an 
> advantages / disadvantages table for S3 and S4 classes?

It seems relevant to compare S3 and S4 code for doing S3-style
programming, leaving more 'advanced' S4 for another day. In S3 I might
define a simple class and method as


makeS3Foo <-
    function(x=numeric(), y=numeric())
{
    if (class(x) != "numeric")
        stop("'x' must be numeric")
    if (class(y) != "numeric")
        stop("'y' must be numeric")
    l <- list(x=x, y=y)
    class(l) <- "S3Foo"
    l
}

doS3 <- function(x, ...) NextMethod("doS3")

doS3.default <- function(x, ...) "doS3 default"

doS3.S3Foo <- function(x, ...) "doS3 of S3Foo"


with an example of use being

> doS3(makeS3Foo())
[1] "doS3 of S3Foo"


I use 'makeS3Foo' as a constructor, so that whenever I make an instance
of what I'm calling class S3Foo, I have some guarantees about its structure.

The S4 implementation might be


setClass("S4Foo", representation(x="numeric", y="numeric"))

makeS4Foo <-
    function(x = numeric(), y=numeric(), ...)
{
    new("S4Foo", x=x, y=y, ...)
}

setGeneric("doS4", function(x, ...) standardGeneric("doS4"),
           useAsDefault=function(x, ...) "do default")

setMethod("doS4", "S4Foo", function(x, ...) "doS4 of S4Foo")

and use with

> doS4(makeS4Foo())
[1] "doS4 of S4Foo"

It seems like the translation between the two is really quite
transparent, and equally arcane to someone new to R.

Some things I get from S4 are a level of automatic type checking

> makeS4Foo(x="bar")
Error in validObject(.Object) :
  invalid class "S4Foo" object: invalid object for slot "x" in class
"S4Foo": got class "character", should be or extend class "numeric"

a way of knowing that my 'S4Foo' conforms to expectations -- in S3 I can say

  l = list(a=1, b=2)
  class(l) <- "S3Foo"

and have no way of knowing whether this is 'valid' or not; in S4 I would
not use this method of creating a class (I'd use my constructor, or
perhaps 'new' if I were being undisciplined, and get type checking as
above) but if I did I'd be able to find

> class(l) <- "S4Foo"
> validObject(l)
Error in validObject(l) :
  invalid class "S4Foo" object: slots in class definition but not in
object: "x", "y"

an error when I try and access data not in the class (normally I'd have
made a constructor, and not use slot access @ directly)

> makeS3Foo()$z
NULL
> makeS4Foo()@z
Error: no slot of name "z" for this object of class "S4Foo"

and reflection on the class

> getClass("S4Foo")
Class “S4Foo” [in ".GlobalEnv"]

Slots:

Name:        x       y
Class: numeric numeric


Some of the tools for documenting S3 classes and methods are more mature
than for S4 (e.g., package.skeleton does a better job of making a
package containing my existing S3 code objects, but I usually use
package.skeleton to start a project, not to move existing objects into a
new project). The fact that the class has been declared explicitly means
 that I'm expected to document it explicitly (in contrast to, say, the
result of lm(), which is documented only as the return value of the
function) and this then requires decisions about how to group class and
method documentation.

Using more complicated S4 features can be, well, more complicated. But
then these features are not readily available with S3, so...

To touch on a couple of other themes in this thread... Using a '.' in a
variable name seems like a very bad idea, given the way S3 dispatch
works. I like to think of objects as nouns and functions as verbs, and
so prefer to capitalize class names (as though they were proper nouns)
and lower-case function names (so they have a more dynamic sense). And
there are many S4 style issues that are not addressed by the google doc
-- setMethod and setGeneric indentation in particular.

A great feature of emacs-ESS that I've recently discovered (thanks
Deepayan for pointing this out, and also for command completion) is the
C-c C-p command when looking at an Rd file during package development;
it very nicely returns the formatted help page to emacs.

Martin

> 
> Thanks




More information about the R-help mailing list