[BioC] IRanges/List oddity: do.call of `c` on a list of IRangesList returns "list" only when the list is named

Hervé Pagès hpages at fhcrc.org
Sat Dec 1 00:28:44 CET 2012


Hi Malcolm,

The problem you are describing can be reproduced by calling c()
directly on S4 objects.

   * With unnamed arguments:

     > c(IRanges(), IRanges())
     IRanges of length 0

     > c(Rle(), Rle())
     logical-Rle of length 0 with 0 runs
       Lengths:
       Values :

   * With named arguments:

     > c(a=IRanges(),b=IRanges())
     $a
     IRanges of length 0

     $b
     IRanges of length 0

     > c(a=Rle(), b=Rle())
     $a
     logical-Rle of length 0 with 0 runs
       Lengths:
       Values :

     $b
     logical-Rle of length 0 with 0 runs
       Lengths:
       Values :

This statement, from the man page for base::c(), points to the root
of the problem:

   S4 methods:

      This function is S4 generic, but with argument list ‘(x, ...,
      recursive = FALSE)’.

Note that, to make things a little bit more confusing, it's not totally
accurate that c() is an S4 generic, at least not on a fresh session:

   > isGeneric("c")
   [1] FALSE

So my understanding of the above statement is that c() will
automatically be turned into an S4 generic at the moment you try
to define an S4 method for it, and, for obscure reasons that I'm not
sure I understand, the argument list used in the definition of this
S4 method must start with 'x'. The consequence of all this is that
dispatch will happen on 'x' so if named arguments are passed with
a name that is not 'x', dispatch will fail and the default method
(which is base::c()) will be called :-b

This explains why things work as expected in the following situations:

   > c(IRanges(), b=IRanges())
   IRanges of length 0

   > c(a=IRanges(), IRanges())
   IRanges of length 0

   > c(a=IRanges(), x=IRanges())
   IRanges of length 0

But when all the arguments are named with names != 'x', then nothing
is passed to 'x' and dispatch fails.
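
The same behavior can be sketched without IRanges. This is only a minimal illustration, not IRanges code: the class "Toy" and its slot 'v' are made up for the example, and the c() method just concatenates the slot values. It assumes nothing beyond the methods package.

```r
library(methods)

# Hypothetical toy class standing in for IRanges (not part of any package).
setClass("Toy", representation(v = "numeric"))

# S4 method for c(); dispatch happens on 'x' only.
setMethod("c", "Toy", function(x, ..., recursive = FALSE) {
    objs <- c(list(x), list(...))
    new("Toy", v = unlist(lapply(objs, function(obj) obj@v), use.names = FALSE))
})

t1 <- new("Toy", v = 1)
t2 <- new("Toy", v = 2)

# Nothing is named: the first argument is matched to 'x', the method is used.
unnamed <- c(t1, t2)

# One argument is explicitly named 'x': dispatch still happens on it.
named_x <- c(a = t1, x = t2)

# All arguments named, none of them 'x': per the explanation above, this
# is expected to fall through to base::c() and return a plain list.
all_named <- c(a = t1, b = t2)

# Caller-side workaround from Malcolm's message: drop the names first.
unnamed_again <- do.call(c, unname(list(a = t1, b = t2)))

print(class(unnamed)); print(class(named_x))
print(class(all_named)); print(class(unnamed_again))
```

Note that in the c(a=, x=) case the argument named 'x' comes first in the result of this toy method, because it is what the method receives as 'x'; the other argument arrives through '...'.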

I didn't have much luck so far with my attempts to work around this:

   1. Trying to change the signature of the c() generic:

      > setGeneric("c", signature="...")
      Error in setGeneric("c", signature = "...") :
        ‘c’ is a primitive function;  methods can be defined, but
       the generic function is implicit, and cannot be changed.

   2. Trying to dispatch on "missing" or "ANY":

      > setMethod("c", "missing", function(x, ..., recursive=FALSE) "YES!")
      Error in setMethod("c", "missing", function(x, ..., recursive = FALSE) "YES!") :
        the method for function ‘c’ and signature x="missing" is sealed and cannot be re-defined

      > setMethod("c", "ANY", function(x, ..., recursive=FALSE) "YES!")
      Error in setMethod("c", "ANY", function(x, ..., recursive = FALSE) "YES!") :
        the method for function ‘c’ and signature x="ANY" is sealed and cannot be re-defined

With old versions of R, dispatch on ... was not possible, i.e. ... was
not allowed in the signature of the generic. This changed in recent
versions of R, and we're already using this feature for a few S4
generics defined in BiocGenerics, e.g. cbind() and rbind():

   > library(BiocGenerics)
   > rbind
   standardGeneric for "rbind" defined from package "BiocGenerics"

   function (..., deparse.level = 1)
   standardGeneric("rbind")
   <environment: 0x29b96b0>
   Methods may be defined for arguments: ...
   Use  showMethods("rbind")  for currently available ones.

And dispatch works as expected, with or without named arguments:

   > rbind(a=DataFrame(X=1:3, Y=11:13), b=DataFrame(X=1:3, Y=21:23))
   DataFrame with 6 rows and 2 columns
             X         Y
     <integer> <integer>
   1         1        11
   2         2        12
   3         3        13
   4         1        21
   5         2        22
   6         3        23

   > rbind(DataFrame(X=1:3, Y=11:13), DataFrame(X=1:3, Y=21:23))
   DataFrame with 6 rows and 2 columns
             X         Y
     <integer> <integer>
   1         1        11
   2         2        12
   3         3        13
   4         1        21
   5         2        22
   6         3        23
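
For comparison, here is a small sketch of a generic that dispatches on '...', in the spirit of the BiocGenerics rbind() above. The generic name toyBind() and the class "Toy" are hypothetical and exist only for this example; this is not how BiocGenerics defines its generics.

```r
library(methods)

# Hypothetical toy class (not part of any package).
setClass("Toy", representation(v = "numeric"))

# Hypothetical generic whose signature is '...' rather than a named argument.
setGeneric("toyBind", function(...) standardGeneric("toyBind"),
           signature = "...")

# Method selected when the arguments passed through '...' are Toy objects.
setMethod("toyBind", "Toy", function(...) {
    objs <- list(...)
    new("Toy", v = unlist(lapply(objs, function(obj) obj@v), use.names = FALSE))
})

t1 <- new("Toy", v = 1)
t2 <- new("Toy", v = 2)

# Argument names play no role in method selection here.
named   <- toyBind(a = t1, b = t2)
unnamed <- toyBind(t1, t2)

print(class(named)); print(class(unnamed))
```

One thing to keep in mind with this kind of dispatch: a method on '...' is selected only when all the arguments passed through '...' are from the given class or one of its subclasses.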

So I wonder if the weird behavior of c() is still justified.

Comments/suggestions to address this are welcome.

Thanks,
H.


On 11/30/2012 11:56 AM, Cook, Malcolm wrote:
> Hi,
>
> The following shows that do.call of `c` on a list of IRangesList returns "list" only when the list is named.
>
>
>> library(IRanges)
>> example(IRangesList)
>> class(x)
> [1] "CompressedIRangesList"
> attr(,"package")
> [1] "IRanges"
>> class(do.call(c,list(x1=x,x2=x)))
> [1] "list"
>
> I am confused by this.
>
> I would not expect the fact that the list is named to have any impact on the result.
>
> But, look, when the list names are omitted, the class is now an IRangesList:
>
>> class(do.call(c,list(x,x)))
> [1] "CompressedIRangesList"
> attr(,"package")
> [1] "IRanges"
>
>
>> class(c(x,x))
> [1] "CompressedIRangesList"
> attr(,"package")
> [1] "IRanges"
>
> A 'workaround' is to unname the list, as demonstrated:
>
>> class(do.call(c,unname(list(x1=x,x2=x))))
> [1] "CompressedIRangesList"
> attr(,"package")
> [1] "IRanges"
>
> But why does having a 'names' attribute affect the behavior of do.calling `c` so much as to change the class returned?
>
>
> Thanks for your help/education.....
>
> Malcolm Cook
> Computational Biology - Stowers Institute for Medical Research
>
>
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] IRanges_1.16.4     BiocGenerics_0.4.0
>
> loaded via a namespace (and not attached):
>   [1] AnnotationDbi_1.20.3   BSgenome_1.26.1        Biobase_2.18.0         Biostrings_2.26.2      DBI_0.2-5              GenomicFeatures_1.10.1 GenomicRanges_1.10.5   RCurl_1.95-3           RSQLite_0.11.2         Rsamtools_1.10.2       XML_3.95-0.1           biomaRt_2.14.0         bitops_1.0-4.2         colorspace_1.2-0       data.table_1.8.6       functional_0.1         graph_1.36.1           gtools_2.7.0           parallel_2.15.1        rtracklayer_1.18.1     stats4_2.15.1          tools_2.15.1           zlibbioc_1.4.0
>>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


