[BioC] On extending class ExpressionSet

Martin Morgan mtmorgan at fhcrc.org
Thu Dec 16 07:46:06 CET 2010


On 12/15/2010 10:05 PM, Renaud Gaujoux wrote:
> Hi Morgan,
> 
> thank you very much for the explanation on the initialize method for the
> class ExpressionSet.
> I always found the S4 initialization mechanism a bit difficult to handle
> in the case of inheritance. It often requires manually passing slots to
> the parent class, which should be automatic: each class (and its
> developer) should have to deal only with its own data members.
> 
> Maybe I am too optimistic, but would something like the following
> definition of initialize,ExpressionSet-method fix the problem without
> changing the current behaviour for the dependent packages (since I
> assume they all pass assayData as a first argument):

It's really hard to get S4 right, and the constraints on class and
method definition are only weakly enforced, so people have come up with
solutions that are not really optimal. From ?initialize, we have

     ...: data to include in the new object.  Named arguments
          correspond to slots in the class definition. Unnamed
          arguments must be objects from classes that this class
          extends.

Note the plural 'classes' and

setClass("A", representation=representation(a="integer"))
setClass("B", representation=representation(b="numeric"))
setClass("C", contains=c("A", "B"))

> new("C", new("A", a=1:5), new("B", b=pi))
An object of class "C"
Slot "a":
[1] 1 2 3 4 5

Slot "b":
[1] 3.141593

so your solution (checking only the first argument) isn't enough: a
superclass instance can be supplied as any of the unnamed arguments.
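
To make that concrete, here is a rough sketch of what a more general check
would have to do: examine every unnamed argument rather than only the first.
The helper splitDots is hypothetical (it is not part of methods or Biobase),
and the classes A, B and C are simply redefined from the example above so the
snippet runs on its own.

```r
library(methods)

setClass("A", representation(a = "integer"))
setClass("B", representation(b = "numeric"))
setClass("C", contains = c("A", "B"))

## Hypothetical helper: partition the arguments passed to new() / initialize()
## into named slot values, unnamed superclass instances, and anything else.
splitDots <- function(Class, ...) {
    args <- list(...)
    nms <- names(args)
    if (is.null(nms))
        nms <- rep("", length(args))
    named <- nzchar(nms)
    ## an unnamed argument counts as a superclass instance when
    ## 'Class' extends the class of that argument
    isSuper <- !named & vapply(args, function(x) {
        extends(Class, class(x)[1L])
    }, logical(1))
    list(slots = args[named],
         superclass = args[isSuper],
         other = args[!named & !isSuper])
}

parts <- splitDots("C", new("B", b = pi), new("A", a = 1:5))
length(parts$superclass)
```

Both unnamed arguments are picked up here, whichever order they come in; a
test on the first argument alone would see only one of them.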

I'll mention that the documentation can point to solutions that will not
work well with inheritance, e.g., from ?"initialize-methods" the two
examples

         setMethod("initialize", "traceable",
           function(.Object, def, tracer, exit, at, print) ...
         )

and

         function(.Object, x, ...) {
           Object <- callNextMethod(.Object, ...)
           if(!missing(x)) { # do something with x

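both use signatures that are inconsistent with the description in
?initialize: the named arguments come before (or in place of) ..., so an
unnamed superclass instance would be matched by position to 'def' or 'x'.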
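
For comparison, a minimal sketch of an initialize method whose signature does
follow ?initialize, with the named argument placed after '...'. The classes
Base and Derived and the slots x and extra are made up for illustration; none
of this is Biobase code.

```r
library(methods)

## Hypothetical classes, for illustration only
setClass("Base", representation(x = "numeric"))
setClass("Derived", representation(extra = "list"), contains = "Base")

setMethod("initialize", "Base",
    function(.Object, ..., x)
{
    ## unnamed superclass instances and named slots are handled
    ## by the default method
    .Object <- callNextMethod(.Object, ...)
    ## only touch the slot when the caller actually supplied 'x'
    if (!missing(x))
        .Object@x <- as.numeric(x)
    .Object
})

b <- new("Base", x = 1:3)
d <- new("Derived", b, extra = list(tag = "demo"))  # b is used as a template
b2 <- initialize(b)                                 # slot x is left alone
```

Because 'x' comes after '...', it can only be matched by name, so the unnamed
'b' reaches the default method and is used to fill the inherited slots. And
since the method never assumes that .Object is the prototype, initialize(b)
does not reset x.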

Also, ExpressionSet was (re)-developed five years ago, and at least some
of the more recent classes, e.g., in IRanges, are more appropriately
implemented.

Martin

> 
> function (.Object, ...)
> {
>     # check if the first argument is an ExpressionSet
>     # if so: initialize the object with it and tell .local not to
>     # overwrite slots corresponding to missing arguments.
>     # otherwise: s
>     overwrite.missing <- TRUE
>     dotargs <- list(...)
>     if (length(dotargs) >= 1 && is(dotargs[[1]], 'ExpressionSet')) {
>         .Object <- dotargs[[1]]
>         overwrite.missing <- FALSE
>         dotargs <- dotargs[-1]
>     }
> 
>     # .local should initialize (i.e. overwrite) a slot of .Object with
>     # its prototype only if overwrite.missing=TRUE, or in any case with
>     # the corresponding non-missing argument for this slot.
>     .local <- function (.Object, overwrite.missing=TRUE, assayData,
>                         phenoData, featureData,
>                         exprs = new("matrix"), ...)
>     {
>         if (overwrite.missing && missing(assayData)) {
>             # stuff ...
>         }
>         # other stuff ...
>     }
> 
>     # call .local with overwrite.missing right after .Object
>     do.call(.local, c(list(.Object, overwrite.missing), dotargs))
> }
> 
> I think this would allow:
> 
> eset <- new('ExpressionSet', exprs=matrix(0,10,5))
> new('ExpressionSet', eset) # simple copy constructor
> # overwrite the original exprs with the one given as an argument
> new('ExpressionSet', eset, exprs=matrix(0,20,3))
> # etc... with other slots
> # and the current behaviour should also work
> new('ExpressionSet', assayData=assayData(eset), exprs=matrix(0,10,5))
> # and initialize will still reset the object to its prototype if called
> # directly (a behaviour one might not want to change, as other packages
> # could rely on it)
> initialize(eset)
> 
> 
> Thank you.
> Renaud
> 
> 
> 
> 
> 
> On 15/12/2010 20:21, Martin Morgan wrote:
>> On 12/15/2010 09:42 AM, Renaud Gaujoux wrote:
>>> Hi,
>>>
>>> I am trying to extend class ExpressionSet in a very simple way to add an
>>> extra slot.
>>> Now suppose I have a valid ExpressionSet object, I want to create an
>>> object of class 'A' as follows (this always worked with other S4 classes
>>> I defined):
>>>
>>> library(Biobase)
>>> setClass('A', representation(extraslot='list'),
>>> contains='ExpressionSet')
>>> eset<- new('ExpressionSet')
>>> new('A', eset)
>>>
>>> # this throws the error:
>>> Error in function (classes, fdef, mtable)  :
>>>    unable to find an inherited method for function
>>> "annotatedDataFrameFrom", for signature "ExpressionSet"
>>> #Note: this also does not work with a non-empty ExpressionSet object.
>>>
>>> Is this normal? Is there a specific way to extend the class
>>> ExpressionSet?
>>> The classes I found that extend ExpressionSet add an extra element in
>>> assayData, and from what I saw it requires defining an initialize method
>>> to pass all the standard parameters to the underlying ExpressionSet
>>> object (exprs, phenoData, featureData, etc...)
>> Hi Renaud --
>>
>> This is an unfortunate consequence of the 'initialize' method that your
>> class inherits from ExpressionSet. 'new' uses the prototype of 'A' to
>> create .Object, and then passes .Object and other arguments to 'new'
>> down to 'initialize'.  The 'initialize' method inherited from
>> ExpressionSet is in part
>>
>>> head(selectMethod(initialize, 'A'))
>> 1 structure(function (.Object, ...)
>> 2 {
>> 3     .local<- function (.Object, assayData, phenoData, featureData,
>> 4         exprs = new("matrix"), ...)
>> 5     {
>> 6         if (missing(assayData)) {
>>
>> with .local invoked as .local(.Object, ...) in the body of
>> initialize. So your 'eset' is then seen as the second argument, and
>> matches by position with the argument 'assayData'; this is not expected
>> to be an ExpressionSet, and trouble ensues. In retrospect the 'right'
>> signature for initialize,ExpressionSet-method would have placed the
>> named arguments after ..., with the user needing to supply named
>> assayData= etc arguments.
>>
>> An additional problem is that the initialize,ExpressionSet-method
>> assumes that .Object is from its prototype, so for instance an exprs
>> with non-zero dimensions is overwritten
>>
>>> eset<- new("ExpressionSet", exprs=matrix(0,10,5))
>>> eset
>> ExpressionSet (storageMode: lockedEnvironment)
>> assayData: 10 features, 5 samples
>>    element names: exprs
>> protocolData: none
>> phenoData: none
>> featureData: none
>> experimentData: use 'experimentData(object)'
>> Annotation:
>>> initialize(eset)
>> ExpressionSet (storageMode: lockedEnvironment)
>> assayData: 0 features, 0 samples
>>    element names: exprs
>> protocolData: none
>> phenoData: none
>> featureData: none
>> experimentData: use 'experimentData(object)'
>> Annotation:
>>
>> You're stuck working with the initialize,ExpressionSet-method as defined
>> (changing it now would disrupt a lot of package and user code). The
>> easiest way forward is probably to write an appropriate initialize
>> method or, probably better given the pitfalls of doing this correctly,
>> to write a constructor that does what you want
>>
>> library(Biobase)
>> setClass('A', representation(extraslot='list'),
>>           contains='ExpressionSet',
>>           prototype=prototype(extraslot=list(a=1:5)))
>>
>> setGeneric("A", function(x, ...) standardGeneric("A"))
>> setMethod(A, "missing", function(x, ...) new("A", ...))
>> setMethod(A, "ExpressionSet", function(x, ...) {
>>      new("A", assayData=assayData(x), phenoData=phenoData(x),
>>          featureData=featureData(x), ...)  ## protocolData here too?
>> })
>>
>> ## test
>> eset<- new('ExpressionSet', exprs=matrix(0, 10, 5))
>> A(eset)
>> A()
>> A(eset)@extraslot
>> A()@extraslot
>> A(eset, extraslot=list(b=5:1))@extraslot
>> A(extraslot=list(b=5:1))@extraslot
>>
>> Martin
>>
>>> Thank you for any insight on the matter.
>>> Renaud
>>>
>>> sessionInfo:
>>> R version 2.12.0 (2010-10-15)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>> locale:
>>>   [1] LC_CTYPE=en_ZA.utf8       LC_NUMERIC=C LC_TIME=en_ZA.utf8
>>> LC_COLLATE=en_ZA.utf8     LC_MONETARY=C       LC_MESSAGES=en_ZA.utf8
>>> LC_PAPER=en_ZA.utf8
>>>   [8] LC_NAME=C                 LC_ADDRESS=C              LC_TELEPHONE=C
>>>             LC_MEASUREMENT=en_ZA.utf8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] Biobase_2.8.0
>>>
>>>
>>>
>>>
>>> ###
>>> UNIVERSITY OF CAPE TOWN
>>> This e-mail is subject to the UCT ICT policies and e-mai...{{dropped:5}}
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


