[Rd] methods cbind2 bind_activation disrupts cbind everywhere

Martin Morgan mtmorgan at fhcrc.org
Sat Sep 15 19:35:46 CEST 2012


On 09/14/2012 07:16 AM, Jeff Ryan wrote:
> Refreshing the memory on performance:
>
> http://r.789695.n4.nabble.com/reduce-limit-number-of-arguments-in-methods-cbind-td921600.html#a921601

short comment below...

>
> My issue had been resolved by a more careful approach taken by timeSeries.
>
> The other option is wholesale deprecation of S4 ... but I won't start
> that conversation with either of you ;-)
>
> Jeff
>
> On Fri, Sep 14, 2012 at 3:00 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
>>>>>>> Martin Morgan <mtmorgan at fhcrc.org>
>>>>>>>      on Wed, 12 Sep 2012 15:23:02 -0700 writes:
>>
>>      > The methods package ?cbind2 includes the instruction to
>>      > use via methods:::bind_activation(TRUE).
>>
>> well, "instruction" only if one wants to magically enable its
>> use for cbind(), rbind()
>>
>>      > use via methods:::bind_activation(TRUE). This changes the
>>      > default definition of cbind globally, disrupting proper
>>      > evaluation in packages not using cbind2.
>>
>> { really disrupting?  I seem to only recall examples of
>>    performance loss, but that memory is fading.. }

the problem came up here

 > library(ggplot2)
 > xx = qplot(x = mpg, y = cyl, data = mtcars, facets = . ~ cyl)
 > xx ## pretty picture
 > methods:::bind_activation(TRUE)   ## done in .onLoad of 2nd package
[1] FALSE
 > xx
Error in rep.int("", ncol(r)) : incorrect type for second argument

with methods:::bind_activation(TRUE) begin set by an unrelated package 
in .onLoad(). So I guess to the extent that the default method does not 
"use R's internal code" as advertised on the man page this could be 
considered a bug report.

>>
>>      > evaluation in packages not using cbind2. Is cbind2 a
>>      > hold-over from a time when ... could not be used for
>>      > dispatch?
>>
>> Yes, exactly.
>>
>> As I'm sure you know well, and   ?dotMethods  explains,
>> the ... dispatch was introduced only in R 2.8.0,
>> *and* if you read that help page, it is alluded to several times
>> that the implementation is still not "perfect" because it
>> entails restrictions, and also because its implementation in
>> pure R rather than (partially) C.
>>
>> I had hoped (but not spent work myself, alas!) that there would
>> be evolution from there on, but it seems we forgot about it.
>>
>>      > What is a safe way for a package to use cbind2?
>>
>> to define methods for cbind2 / rbind2, with nice multiple
>> dispatch on both arguments,  as the 'Matrix' package has been

OK, from this I won't hesitate to advise against bind_activation.

@Jeff -- that was a useful reminder of the performance consequences of 
S4 dispatch, even in the more usual case where the dispatch is correct, 
thanks.

Thanks Martin, Martin.

>> doing for many years (long before R 2.8.0) --> Matrix/R/bind.R
>>
>> And then, instead of "bind_activation",
>> Matrix now also defines  cBind() & rBind()
>> as substitutes for cbind(), rbind(),
>> simply as using  methods:::cbind()  which recursively calls
>> cbind2(), rbind2().
>>
>> This has still the big drawback that  cbind() fails on "Matrix"
>> matrices {and even with quite an unhelpful error message!},
>> and useRs must learn to use cBind() instead....
>> not ideal, at all.
>>
>>
>> In an ideal R world,
>> 1) the "..." S4 dispatch would be improved and made faster
>> 2) that part of Matrix would be rewritten, so that  cbind() and rbind()
>>     would work via '...' dispatch on both "Matrix" matrices and
>>     all standard R objects (atomic vectors, "matrix", data.frame,...),
>>     and the use of cBind() and rBind() could be deprecated.
>>
>> I also have a vague recollection that '2)' was not just a job to
>> be done with finite some effort, but rather seemed not easily
>> achievable with the current restriction of "..." dispatch
>> needing all arguments to be of the same superclass.
>> We would have to define
>>
>>   setGeneric("cbind", signature = "...")
>>   setGeneric("rbind", signature = "...")
>>
>>   setMethod("cbind", "Mnumber", function(..., deparse.level=1) {
>>   ........
>>   ........
>>   })
>>
>> similarly to what I do in the Rmpfr package,
>> and where "Mnumber" is a *large* class union... defined in
>> Rmpfr/R/AllClasses.R and if you look in there, you see comments
>> with FIXME ... basically the solution was +- ok, but not really
>> entirely satisfactory to me.
>>
>> Still, we maybe should really discuss to have the above two
>> setGeneric()s in 'methods', and possibly also some class union /
>> superclass  definitions there (that other packages could use!) possibly even
>> with
>> setMethod("cbind", <large-mother-class>,
>>            function(..., deparse.level=1) {
>>            ......
>>            })
>> and of course the same with rbind().
>>
>>
>>      > This came up in the context of complex package
>>      > dependencies in Bioconductor, as detailed in this thread
>>      > (sorry for the html).
>>
>>      > https://stat.ethz.ch/pipermail/bioc-devel/2012-September/003617.html
>>
>>      > --
>>      > Dr. Martin Morgan Fred Hutchinson Cancer Research Center
>>      > 1100 Fairview Ave. N.  PO Box 19024 Seattle, WA 98109
>>
>>      > ______________________________________________
>>      > R-devel at r-project.org mailing list
>>      > https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



More information about the R-devel mailing list