[Rd] Proper way to define cbind, rbind for s4 classes in package

Martin Maechler maechler at lynne.stat.math.ethz.ch
Mon Feb 2 12:32:48 CET 2015


>>>>> Michael Lawrence <lawrence.michael at gene.com>
>>>>>     on Sun, 1 Feb 2015 19:23:06 -0800 writes:

    > I've implemented the proposed changes in
    > R-devel. Minimally tested, so please try it. It should
    > delegate to r/cbind2 when there is at least one S4
    > argument and S3 dispatch fails (so you'll probably want to
    > add an S3 method for your class to introduce a conflict,
    > otherwise it will dispatch to cbind.data.frame if one of
    > the args is a data.frame). There may no longer be a need
    > for cBind() and rBind().

    > Michael

This sounds great!   Thank you very much, Michael!
:-) :-)

... but .... :-(  experiments with the Matrix package (and R-devel
with your change) show a remaining buglet in the treatment of dimnames:

   > M1 <- Matrix(m1 <- matrix(1:12, 3,4))
   > cbind(m1, MM = -1)
                 MM
   [1,] 1 4 7 10 -1
   [2,] 2 5 8 11 -1
   [3,] 3 6 9 12 -1
   > cbind(M1, MM = -1)   ## ---- notice the "..."
   3 x 5 Matrix of class "dgeMatrix"
                 ...
   [1,] 1 4 7 10  -1
   [2,] 2 5 8 11  -1
   [3,] 3 6 9 12  -1
   > rbind(R1 = 10:11, m1)
      [,1] [,2] [,3] [,4]
   R1   10   11   10   11
         1    4    7   10
         2    5    8   11
         3    6    9   12
   > rbind(R1 = 10:11, M1) ## --- notice the 'deparse.level'
   4 x 4 Matrix of class "dgeMatrix"
                 [,1] [,2] [,3] [,4]
   deparse.level   10   11   10   11
                    1    4    7   10
                    2    5    8   11
                    3    6    9   12
   > 
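
For reference, a minimal sketch of the dimnames that base R builds for the
two plain-matrix calls above, which is presumably what the Matrix results
should agree with.  It uses base R only; the names 'cn' and 'rn' are just
for illustration.

```r
m1 <- matrix(1:12, 3, 4)

# Column names when a named vector is bound to a matrix without dimnames:
# only the named argument contributes a (non-empty) name.
cn <- colnames(cbind(m1, MM = -1))

# Row names when a named vector is bound on top of the same matrix.
rn <- rownames(rbind(R1 = 10:11, m1))

print(cn)
print(rn)
```
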

Also, it seems you are not observing the 'deparse.level'
argument at all: 
Looking at the last three lines of the example in  ?cbind,

     rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0) # middle 2 rownames
     rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1) # 3 rownames (default)
     rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2) # 4 rownames
     
but using a Matrix matrix 'dd', we see that the (row)names
construction needs to be amended:

  > (dd <- Matrix(rbind(c(0:1,0,0))))
  1 x 4 sparse Matrix of class "dgCMatrix"

  [1,] . 1 . .

  > rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0) # middle 2 rownames
  4 x 4 sparse Matrix of class "dgCMatrix"

  deparse.level  1  2  3  4
  c              2  2  2  2
  a++           10 10 10 10
                 .  1  .  .
  > rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1) # 3 rownames (default)
  4 x 4 sparse Matrix of class "dgCMatrix"

  deparse.level  1  2  3  4
  c              2  2  2  2
  a++           10 10 10 10
                 .  1  .  .
  > rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2) # 4 rownames
  4 x 4 sparse Matrix of class "dgCMatrix"

  deparse.level  1  2  3  4
  c              2  2  2  2
  a++           10 10 10 10
                 .  1  .  .
  > 
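
As a baseline, here is a small sketch of what the three 'deparse.level'
settings give in base R when 'dd' is the plain number from the ?cbind
example (an assumption: 'dd <- 10', not the Matrix object above).  The
variable names rn0, rn1, rn2 are only for illustration.

```r
dd <- 10  # plain numeric vector, as in the ?cbind example

# deparse.level = 0: only explicitly named arguments get row names
rn0 <- rownames(rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0))

# deparse.level = 1 (default): arguments that are symbols ('dd') are named too
rn1 <- rownames(rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1))

# deparse.level = 2: every vector argument is named by its deparsed expression
rn2 <- rownames(rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2))

print(list(rn0, rn1, rn2))
```
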



    > On Mon, Jan 26, 2015 at 3:55 AM, Martin Maechler <
    > maechler at lynne.stat.math.ethz.ch> wrote:

    >> >>>>> Michael Lawrence <lawrence.michael at gene.com>
    >> >>>>>     on Sat, 24 Jan 2015 06:39:37 -0800 writes:
    >> 
    >> > On Sat, Jan 24, 2015 at 12:58 AM, Mario Annau
    >> > <mario.annau at gmail.com> wrote:
    >> >> Hi all, this question has already been posted on
    >> >> stackoverflow, however without success, see also
    >> >>
    >> >> http://stackoverflow.com/questions/27886535/proper-way-to-use-cbind-rbind-with-s4-classes-in-package
    >> >> .
    >> >>
    >> >> I have written a package using S4 classes and would like
    >> >> to use the functions rbind, cbind with these defined
    >> >> classes.
    >> >>
    >> >> Since it does not seem to be possible to define rbind and
    >> >> cbind directly as S4 methods (see ?cBind) I defined
    >> >> rbind2 and cbind2 instead:
    >> >>
    >> 
    >> > This needs some clarification. It certainly is possible to
    >> > define cbind and rbind methods. The BiocGenerics package
    >> > defines generics for those and many methods are defined by
    >> > e.g. S4Vectors, IRanges, etc.  The issue is that dispatch
    >> > on "..." is singular, i.e., you can only specify one class
    >> > that all args in "..." must share (potentially through
    >> > inheritance).
    >> 
    >> > Thus, trying to combine objects from a different
    >> > hierarchy (or non-S4 objects) will not work.
    >> 
    >> Yes, indeed, that's the drawback
    >> 
    >> I've been there almost surely before everyone else, with
    >> the Matrix package...  and I have been the author of
    >> cbind2(), rbind2(), and of course, of cBind(), and
    >> rBind().
    >> 
    >> At the time when I introduced these, the above
    >> possibility of writing S4 methods for '...' was not
    >> yet part of R.
    >> 
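
To make the "singular dispatch on '...'" limitation concrete, here is a
minimal sketch.  The class 'ClassA' and the generic 'bindRows' are
hypothetical names made up for illustration; they are not part of Matrix,
BiocGenerics or base R.

```r
library(methods)

# Hypothetical class wrapping a plain matrix (illustration only).
setClass("ClassA", representation(data = "matrix"))

# Hypothetical generic that dispatches on "..." as a whole.
setGeneric("bindRows", function(...) standardGeneric("bindRows"),
           signature = "...")

# A single class is named in the method signature; the method is selected
# only when every argument in "..." is a ClassA (or inherits from it).
setMethod("bindRows", "ClassA", function(...) {
    parts <- lapply(list(...), function(obj) obj@data)
    new("ClassA", data = do.call(rbind, parts))
})

a <- new("ClassA", data = matrix(1:4, 2))
b <- new("ClassA", data = matrix(5:8, 2))

ok <- bindRows(a, b)   # all arguments are ClassA: the method applies

# Mixing in a non-ClassA argument: no common class with a method, so
# dispatch fails (there is no default method in this sketch).
mixed <- try(bindRows(a, matrix(0, 1, 2)), silent = TRUE)
```

The failing mixed call is exactly the drawback discussed above: an object
from another hierarchy, or a non-S4 object, has to be converted by hand
before the method can be used.
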
    >> > This has not been a huge problem for us in practice. For
    >> > example, we have a DataFrame object that mimics
    >> > data.frame. To cbind a data.frame with a DataFrame, the
    >> > user can just call the DataFrame() constructor. rbind()
    >> > between different data structures is much less common.
    >> 
    >> well... yes and no.  Think of using the Matrix package,
    >> maybe with another package that defines another
    >> generalized matrix class...  It would be nice if things
    >> worked automatically / perfectly there.
    >> 
    >> > The cBind and rBind functions in Matrix (and the r/cbind
    >> > that get installed by bind_activation, the code is shared)
    >> > work by recursing, dropping the first argument until two
    >> > are left, and then combining with r/cbind2(). The Biobase
    >> > package uses a similar strategy to mimic c() via its
    >> > non-standard combine() generic. The nice thing about the
    >> > combine() approach is the user entry point and the generic
    >> > are the same, instead of having methods on rbind2() and
    >> > the user calling rBind().
    >> 
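
The recursion strategy just described can be sketched roughly as follows.
This is not the actual cBind() code from Matrix; 'ClassA', 'unwrap',
'bindA' and 'cbindRecursive' are hypothetical names used only to
illustrate the idea, and only cbind2() from package methods is assumed.

```r
library(methods)

# Hypothetical class wrapping a plain matrix (illustration only).
setClass("ClassA", representation(data = "matrix"))

# Helper: extract the underlying data, leaving non-ClassA objects alone.
unwrap <- function(obj) if (is(obj, "ClassA")) obj@data else obj

# Pairwise combination; the result is always promoted to ClassA.
bindA <- function(x, y, ...) {
    new("ClassA", data = cbind(unwrap(x), unwrap(y)))
}
setMethod("cbind2", signature(x = "ClassA", y = "ANY"),    bindA)
setMethod("cbind2", signature(x = "ANY",    y = "ClassA"), bindA)
setMethod("cbind2", signature(x = "ClassA", y = "ClassA"), bindA)

# Drop the first argument until two are left, then combine pairwise
# with cbind2() on the way back up.
cbindRecursive <- function(...) {
    args <- list(...)
    n <- length(args)
    if (n == 0L) return(NULL)
    if (n == 1L) return(args[[1L]])
    if (n == 2L) return(cbind2(args[[1L]], args[[2L]]))
    cbind2(args[[1L]], do.call(cbindRecursive, args[-1L]))
}

a   <- new("ClassA", data = matrix(1:6, 3))
res <- cbindRecursive(1:3, a, 7:9)   # mixed S4 / non-S4 arguments
```

Because each step only ever sees two arguments, the pairwise methods
decide the result type, which is the "type promotion" aspect of the
recursion.
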
    >> > I would argue that bind_activation(TRUE) should be
    >> > discouraged,
    >> 
    >> Yes, you are right Michael; it should be discouraged at
    >> least to be run in a *package*.  One could think of its
    >> use by an explicit user call.
    >> 
    >> > because it replaces the native rbind and cbind with
    >> > recursive variants that are going to cause problems,
    >> > performance and otherwise. This is why it is hidden.
    >> > Perhaps a reasonable compromise would be for the native
    >> > cbind and rbind to check whether any arguments are S4 and
    >> > if so, resort to recursion. Recursion does seem to be a
    >> > clean way to implement "type promotion", i.e., to answer
    >> > the question "which type should the result be when faced
    >> > with mixed-type args?".
    >> 
    >> Exactly.  That has been my idea at the time ..  ((yes,
    >> I'm also the author of the bind_activation()
    >> "(mis)functionality".))
    >> 
    >> > Hopefully others have better ideas.
    >> 
    >> that would be great.
    >> 
    >> And even if not, it would be great if we could implement
    >> your idea
    >> 
    >> > Perhaps a reasonable compromise would be for the native
    >> > cbind and rbind to check whether any arguments are S4 and
    >> > if so, resort to recursion.
    >> 
    >> without a noticeable performance penalty in the case of no
    >> S4 arguments.
    >> 
    >> Martin
    >> 
    >> 
    >> > Michael
    >> 
    >> >> setMethod("rbind2", signature(x="ClassA", y = "ANY"),
    >> >>     function(x, y) {
    >> >>         # Do stuff ...
    >> >>     })
    >> >>
    >> >> setMethod("cbind2", signature(x="ClassA", y = "ANY"),
    >> >>     function(x, y) {
    >> >>         # Do stuff ...
    >> >>     })
    >> >>
    >> >> From ?cbind2 I learned that these functions need to be
    >> >> activated using methods:::bind_activation to replace
    >> >> rbind and cbind from base.
    >> >>
    >> >> I included the call in the package file R/zzz.R using the
    >> >> .onLoad function:
    >> >>
    >> >> .onLoad <- function(...) {
    >> >>     # Bind activation of cbind(2) and rbind(2) for S4 classes
    >> >>     methods:::bind_activation(TRUE)
    >> >> }
    >> >>
    >> >> This works as expected. However, running R CMD check I am
    >> >> now getting the following NOTE since I am using an
    >> >> unexported function in methods:
    >> >>
    >> >> * checking dependencies in R code ... NOTE
    >> >> Unexported object imported by a ':::' call:
    >> >> 'methods:::bind_activation'
    >> >> See the note in ?`:::` about the use of this operator.
    >> >>
    >> >> How can I get rid of the NOTE and what is the proper way
    >> >> to define the methods cbind and rbind for S4 classes in a
    >> >> package?
    >> >>
    >> >> Best, mario
    >> >>
    >> >> ______________________________________________
    >> >> R-devel at r-project.org mailing list
    >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 
    >>


