[BioC] IRanges: interaction with model fitting

Martin Morgan mtmorgan at fhcrc.org
Sun Aug 22 21:49:35 CEST 2010


On 08/22/2010 01:25 AM, Michael Lawrence wrote:
> On Fri, Aug 20, 2010 at 5:32 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> 
>> On 08/20/2010 02:38 PM, Kasper Daniel Hansen wrote:
>>> locfit is a package from CRAN for doing local regression (think
>>> loess).  It does not have a NAMESPACE, and it uses what to my casual
>>> eyes look like a pretty standard model setup.  But there is a strange
>>> thing happening when IRanges is loaded.  An example is best:
>>>
>>> First get some data and define two functions that essentially does the
>> same:
>>>
>>> library(locfit)
>>> utils::data(anorexia, package = "MASS")
>>> myData <- anorexia
>>> tmp <- function(dat) {
>>>     anorex.1 <- glm(Postwt ~ lp(Prewt),
>>>                            family = "gaussian", data = dat)
>>> }
>>> tmp1 <- function(dat1) {
>>>     tmp2 <- function(dat2) {
>>>         anorex.1 <- locfit(Postwt ~ lp(Prewt),
>>>                            family = gaussian, data = dat2)
>>>
>>>     }
>>>     tmp2(dat2 = dat1)
>>> }
>>>
>>> tmp(myData)
>>> tmp1(myData)
>>>
>>> This works great.  Now load IRanges and do the same
>>>
>>> library(IRanges)
>>>> tmp(myData)
>>> Error in terms.formula(formula, data = data) :
>>>   'data' argument is of the wrong type
>>>
>>> Enter a frame number, or 0 to exit
>>>
>>> 1: tmp(myData)
>>> 2: locfit(Postwt ~ lp(Prewt), family = "gaussian", data = dat)
>>> 3: eval(m, sys.frame(sys.parent()))
>>> 4: eval(m, sys.frame(sys.parent()))
>>> 5: eval(expr, envir, enclos)
>>> 6: model.frame(formula = Postwt ~ lp(Prewt), data = dat)
>>> 7: model.frame.default(formula = Postwt ~ lp(Prewt), data = dat)
>>> 8: terms(formula, data = data)
>>> 9: terms.formula(formula, data = data)
>>>
>>> Selection: 0
>>>> tmp1(myData)
>>> Error in inherits(x, "data.frame") : object 'dat2' not found
>>>
>>> Enter a frame number, or 0 to exit
>>>
>>>  1: tmp1(myData)
>>>  2: tmp2(dat2 = dat1)
>>>  3: locfit(Postwt ~ lp(Prewt), family = gaussian, data = dat2)
>>>  4: eval(m, sys.frame(sys.parent()))
>>>  5: eval(m, sys.frame(sys.parent()))
>>>  6: eval(expr, envir, enclos)
>>>  7: model.frame(formula = Postwt ~ lp(Prewt), data = dat2)
>>>  8: model.frame.default(formula = Postwt ~ lp(Prewt), data = dat2)
>>>  9: is.data.frame(data)
>>> 10: inherits(x, "data.frame")
>>>
>>> Selection: 0
>>>
>>> Clearly something seems wrong.  The error messages for tmp and tmp1
>>> are different.  Perhaps the lacking NAMESPACE for locfit makes a
>>> difference.  And locfit sets up the design matrix using a slightly
>>> different set of commands, that to my eye mostly look like an older
>>> paradigm.
>>>
>>> Any idea on what happens and why this is affected by IRanges?
>>
>> It traces to IRanges' promotion of eval to a generic. A workaround is
>>
>> library(IRanges)
>> eval = base::eval
>>
>> Creating the generic changes the value of 'envir' (the lazy evaluation
>> of sys.frame(sys.parent())) in the call, as shown with
>>
>>> trace(eval, signature=c("ANY", "ANY"), tracer=quote(print(ls(envir))))
>>> tmp1(anorexia) ## 'locfit' envir
>> Tracing eval(m, sys.frame(sys.parent())) on entry
>>  [1] "base"    "cens"    "data"    "formula" "geth"    "i"       "lfproc"
>>  [8] "m"       "subset"  "Terms"   "weights" "z"
>> Error in inherits(x, "data.frame") : object 'dat2' not found
>>
>> verus
>>
>>> eval = base::eval
>>> trace(eval, tracer=quote(print(ls(envir))))
>> [1] "eval"
>>> tmp1(anorexia) ## 'tmp2' envir
>> Tracing eval(m, sys.frame(sys.parent())) on entry
>> [1] "dat2"
>>
>> I don't think there is a robust way around this, but maybe others have a
>> good idea?
>>
>>
> I just added a non-standard generic for "eval" that forces the evaluation of
> the arguments. Works in this case. Is that sensible?

I can't think of an example where this will be positively wrong; it
won't surprise me when one is reported. I'm ok with the
nonstandardGeneric until then. Martin

> 
> Michael
> 
> 
> 
>> Martin
>>>
>>> Kasper
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



More information about the Bioconductor mailing list