[Rd] Proper way to define cbind, rbind for s4 classes in package

Martin Maechler maechler at lynne.stat.math.ethz.ch
Mon Jan 26 12:55:18 CET 2015


>>>>> Michael Lawrence <lawrence.michael at gene.com>
>>>>>     on Sat, 24 Jan 2015 06:39:37 -0800 writes:

    > On Sat, Jan 24, 2015 at 12:58 AM, Mario Annau
    > <mario.annau at gmail.com> wrote:
    >> Hi all, this question has already been posted on
    >> stackoverflow, however without success, see also
    >> http://stackoverflow.com/questions/27886535/proper-way-to-use-cbind-rbind-with-s4-classes-in-package.
    >> 
    >> I have written a package using S4 classes and would like
    >> to use the functions rbind, cbind with these defined
    >> classes.
    >> 
    >> Since it does not seem to be possible to define rbind and
    >> cbind directly as S4 methods (see ?cBind) I defined
    >> rbind2 and cbind2 instead:
    >> 

    > This needs some clarification. It certainly is possible to
    > define cbind and rbind methods. The BiocGenerics package
    > defines generics for those and many methods are defined by
    > e.g. S4Vectors, IRanges, etc.  The issue is that dispatch
    > on "..." is singular, i.e., you can only specify one class
    > that all args in "..." must share (potentially through
    > inheritance).

    > Thus, trying to combine objects from a
    > different hierarchy (or non-S4 objects) will not
    > work. 

Yes, indeed, that's the drawback

I've been there almost surely before everyone else, with the
Matrix package...
and I have been the author of  
    cbind2(), rbind2(), and of course, of  cBind(), and rBind().

At the time when I introduced these, the above possibility of
writing S4 methods for  '...'  was not yet part of R.
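
For readers who have not seen dispatch on '...' in practice, here is
a minimal sketch of what Michael describes. "ClassA" and its slot "m"
are hypothetical names used only for illustration; the setGeneric()
call follows the approach he attributes to BiocGenerics, but this is
not that package's code.

```r
library(methods)

# Hypothetical class used only for this illustration
setClass("ClassA", representation(m = "matrix"))

# Turn cbind into an S4 generic that dispatches on "..."
setGeneric("cbind", signature = "...")

# Selected only when *all* arguments in "..." are (or inherit from) ClassA
setMethod("cbind", "ClassA", function(..., deparse.level = 1) {
    parts <- lapply(list(...), function(obj) obj@m)
    new("ClassA", m = do.call(base::cbind, parts))
})

a <- new("ClassA", m = matrix(1:4, 2))
b <- new("ClassA", m = matrix(5:8, 2))

ab    <- cbind(a, b)        # both arguments are ClassA: the method above runs
plain <- cbind(1:2, 3:4)    # no ClassA involved: the default (base) method runs
```

A call that mixes a ClassA object with an object from another class
hierarchy does not select the ClassA method, which is exactly the
drawback discussed above.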

    > This has not been a huge problem for us in
    > practice. For example, we have a DataFrame object that
    > mimics data.frame. To cbind a data.frame with a DataFrame,
    > the user can just call the DataFrame()
    > constructor. rbind() between different data structures is
    > much less common.

well... yes and no.  Think of using the Matrix package, maybe
with another package that defines another generalized matrix class...
It would be nice if things worked automatically / perfectly there.

    > The cBind and rBind functions in Matrix (and the r/cbind
    > that get installed by bind_activation, the code is shared)
    > work by recursing, dropping the first argument until two
    > are left, and then combining with r/cbind2(). The Biobase
    > package uses a similar strategy to mimic c() via its
    > non-standard combine() generic. The nice thing about the
    > combine() approach is the user entry point and the generic
    > are the same, instead of having methods on rbind2() and
    > the user calling rBind().
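
The recursion described in the paragraph above can be sketched roughly
as follows. This is not the actual Matrix code: "ClassA" and the helper
bind_cols() are hypothetical names, and only cbind2() itself comes from
the methods package.

```r
library(methods)

# Hypothetical class used only for this illustration
setClass("ClassA", representation(m = "matrix"))

# Pairwise methods: cbind2() dispatches on both of its two arguments
setMethod("cbind2", signature(x = "ClassA", y = "ClassA"),
          function(x, y, ...) new("ClassA", m = base::cbind(x@m, y@m)))
setMethod("cbind2", signature(x = "ClassA", y = "matrix"),
          function(x, y, ...) new("ClassA", m = base::cbind(x@m, y)))
setMethod("cbind2", signature(x = "matrix", y = "ClassA"),
          function(x, y, ...) new("ClassA", m = base::cbind(x, y@m)))

# Hypothetical user entry point: peel off the first argument, recurse on
# the rest, and combine the two pieces with cbind2()
bind_cols <- function(...) {
    args <- list(...)
    if (length(args) == 0L) return(NULL)
    if (length(args) == 1L) return(args[[1L]])
    if (length(args) == 2L) return(cbind2(args[[1L]], args[[2L]]))
    cbind2(args[[1L]], do.call(bind_cols, args[-1L]))
}

a   <- new("ClassA", m = matrix(1:4, 2))
res <- bind_cols(matrix(0, 2, 1), a, matrix(9, 2, 1))
```

Because every step combines only two objects, each pairwise method
decides the class of its own result, which is how mixed-type calls
end up "promoted" to ClassA in this sketch.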

    > I would argue that bind_activation(TRUE) should be
    > discouraged, 

Yes, you are right Michael; its use should be discouraged, at
least from within a *package*.
One could still think of its use by an explicit user call.

    > because it replaces the native rbind and
    > cbind with recursive variants that are going to cause
    > problems, performance and otherwise. This is why it is
    > hidden. Perhaps a reasonable compromise would be for the
    > native cbind and rbind to check whether any arguments are
    > S4 and if so, resort to recursion. Recursion does seem to
    > be a clean way to implement "type promotion", i.e., to
    > answer the question "which type should the result be when
    > faced with mixed-type args?".

Exactly.  That has been my idea at the time ..
((yes, I'm also the author of the  bind_activation() 
  "(mis)functionality".))

    > Hopefully others have better ideas.

that would be great.

And even if not, it would be great if we could implement your
idea
    > Perhaps a reasonable compromise would be for the
    > native cbind and rbind to check whether any arguments are
    > S4 and if so, resort to recursion.

without a noticeable performance penalty in the case of no S4
arguments.
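
At the R level, the idea could look roughly like the sketch below. It
is only an illustration of the control flow, not a proposal for the
actual implementation in base R: cbind_compromise() and "ClassA" are
hypothetical names, and the S4 path uses a simple left-to-right
pairwise reduction through cbind2().

```r
library(methods)

# Hypothetical class and pairwise method, for illustration only
setClass("ClassA", representation(m = "matrix"))
setMethod("cbind2", signature(x = "ClassA", y = "ANY"),
          function(x, y, ...) {
              ym <- if (is(y, "ClassA")) y@m else y
              new("ClassA", m = base::cbind(x@m, ym))
          })

# Hypothetical wrapper: take the pairwise route only if an S4 object is present
cbind_compromise <- function(..., deparse.level = 1) {
    args <- list(...)
    has_s4 <- any(vapply(args, isS4, logical(1)))
    if (!has_s4) {
        # No S4 arguments: hand everything to the native cbind unchanged
        return(base::cbind(..., deparse.level = deparse.level))
    }
    # At least one S4 argument: combine pairwise via cbind2()
    Reduce(cbind2, args)
}

a     <- new("ClassA", m = matrix(1:4, 2))
mixed <- cbind_compromise(a, matrix(9, 2, 1))   # goes through cbind2()
plain <- cbind_compromise(1:2, 3:4)             # goes straight to base::cbind
```

In this sketch the only extra cost for non-S4 input is the isS4()
scan over the arguments.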

Martin


    > Michael

    >> setMethod("rbind2", signature(x="ClassA", y = "ANY"),
    >>           function(x, y) {
    >>               # Do stuff ...
    >>           })
    >> 
    >> setMethod("cbind2", signature(x="ClassA", y = "ANY"),
    >>           function(x, y) {
    >>               # Do stuff ...
    >>           })
    >> 
    >> From ?cbind2 I learned that these functions need to be
    >> activated using methods:::bind_activation to replace
    >> rbind and cbind from base.
    >> 
    >> I included the call in the package file R/zzz.R using the
    >> .onLoad function:
    >> 
    >> .onLoad <- function(...) {
    >>     # Bind activation of cbind(2) and rbind(2) for S4 classes
    >>     methods:::bind_activation(TRUE)
    >> }
    >> 
    >> This works as expected. However, running R CMD check I am
    >> now getting the following NOTE since I am using an
    >> unexported function in methods:
    >> 
    >> * checking dependencies in R code ... NOTE
    >> Unexported object imported by a ':::' call:
    >> 'methods:::bind_activation'
    >> See the note in ?`:::` about the use of this operator.
    >> 
    >> How can I get rid of the NOTE and what is the proper way
    >> to define the methods cbind and rbind for S4 classes in a
    >> package?
    >> 
    >> Best, mario
    >> 
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel



