[BioC] IRanges: interaction with model fitting

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Mon Aug 23 15:12:52 CEST 2010


Thanks a lot for the help.

I agree with the fix and I am also surprised that parent.frame() is
different from sys.frame(sys.parent()).

I'll contact the locfit maintainer and see if I can get a NAMESPACE
and use parent.frame and possibly a few other things.  But that only
addresses one out of many packages on CRAN.

Kasper


On Sun, Aug 22, 2010 at 5:10 PM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
> It's not a fix-all, e.g. callNextMethod with arguments (admittedly unusual)
> would still behave unexpectedly.  But, I agree, it's hard to think of cases
> where forcing either 'envir' or 'enclos' to "eval" would be undesirable.
> It's not like anyone wants to deparse(substitute()) them.
>
> Michael
>
> On Sun, Aug 22, 2010 at 12:49 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>>
>> On 08/22/2010 01:25 AM, Michael Lawrence wrote:
>> > On Fri, Aug 20, 2010 at 5:32 PM, Martin Morgan <mtmorgan at fhcrc.org>
>> > wrote:
>> >
>> >> On 08/20/2010 02:38 PM, Kasper Daniel Hansen wrote:
>> >>> locfit is a package from CRAN for doing local regression (think
>> >>> loess).  It does not have a NAMESPACE, and it uses what to my casual
>> >>> eyes look like a pretty standard model setup.  But there is a strange
>> >>> thing happening when IRanges is loaded.  An example is best:
>> >>>
>> >>> First get some data and define two functions that essentially does the
>> >> same:
>> >>>
>> >>> library(locfit)
>> >>> utils::data(anorexia, package = "MASS")
>> >>> myData <- anorexia
>> >>> tmp <- function(dat) {
>> >>>     anorex.1 <- glm(Postwt ~ lp(Prewt),
>> >>>                            family = "gaussian", data = dat)
>> >>> }
>> >>> tmp1 <- function(dat1) {
>> >>>     tmp2 <- function(dat2) {
>> >>>         anorex.1 <- locfit(Postwt ~ lp(Prewt),
>> >>>                            family = gaussian, data = dat2)
>> >>>
>> >>>     }
>> >>>     tmp2(dat2 = dat1)
>> >>> }
>> >>>
>> >>> tmp(myData)
>> >>> tmp1(myData)
>> >>>
>> >>> This works great.  Now load IRanges and do the same
>> >>>
>> >>> library(IRanges)
>> >>>> tmp(myData)
>> >>> Error in terms.formula(formula, data = data) :
>> >>>   'data' argument is of the wrong type
>> >>>
>> >>> Enter a frame number, or 0 to exit
>> >>>
>> >>> 1: tmp(myData)
>> >>> 2: locfit(Postwt ~ lp(Prewt), family = "gaussian", data = dat)
>> >>> 3: eval(m, sys.frame(sys.parent()))
>> >>> 4: eval(m, sys.frame(sys.parent()))
>> >>> 5: eval(expr, envir, enclos)
>> >>> 6: model.frame(formula = Postwt ~ lp(Prewt), data = dat)
>> >>> 7: model.frame.default(formula = Postwt ~ lp(Prewt), data = dat)
>> >>> 8: terms(formula, data = data)
>> >>> 9: terms.formula(formula, data = data)
>> >>>
>> >>> Selection: 0
>> >>>> tmp1(myData)
>> >>> Error in inherits(x, "data.frame") : object 'dat2' not found
>> >>>
>> >>> Enter a frame number, or 0 to exit
>> >>>
>> >>>  1: tmp1(myData)
>> >>>  2: tmp2(dat2 = dat1)
>> >>>  3: locfit(Postwt ~ lp(Prewt), family = gaussian, data = dat2)
>> >>>  4: eval(m, sys.frame(sys.parent()))
>> >>>  5: eval(m, sys.frame(sys.parent()))
>> >>>  6: eval(expr, envir, enclos)
>> >>>  7: model.frame(formula = Postwt ~ lp(Prewt), data = dat2)
>> >>>  8: model.frame.default(formula = Postwt ~ lp(Prewt), data = dat2)
>> >>>  9: is.data.frame(data)
>> >>> 10: inherits(x, "data.frame")
>> >>>
>> >>> Selection: 0
>> >>>
>> >>> Clearly something seems wrong.  The error messages for tmp and tmp1
>> >>> are different.  Perhaps the lacking NAMESPACE for locfit makes a
>> >>> difference.  And locfit sets up the design matrix using a slightly
>> >>> different set of commands, that to my eye mostly look like an older
>> >>> paradigm.
>> >>>
>> >>> Any idea on what happens and why this is affected by IRanges?
>> >>
>> >> It traces to IRanges' promotion of eval to a generic. A workaround is
>> >>
>> >> library(IRanges)
>> >> eval = base::eval
>> >>
>> >> Creating the generic changes the value of 'envir' (the lazy evaluation
>> >> of sys.frame(sys.parent())) in the call, as shown with
>> >>
>> >>> trace(eval, signature=c("ANY", "ANY"), tracer=quote(print(ls(envir))))
>> >>> tmp1(anorexia) ## 'locfit' envir
>> >> Tracing eval(m, sys.frame(sys.parent())) on entry
>> >>  [1] "base"    "cens"    "data"    "formula" "geth"    "i"
>> >> "lfproc"
>> >>  [8] "m"       "subset"  "Terms"   "weights" "z"
>> >> Error in inherits(x, "data.frame") : object 'dat2' not found
>> >>
>> >> verus
>> >>
>> >>> eval = base::eval
>> >>> trace(eval, tracer=quote(print(ls(envir))))
>> >> [1] "eval"
>> >>> tmp1(anorexia) ## 'tmp2' envir
>> >> Tracing eval(m, sys.frame(sys.parent())) on entry
>> >> [1] "dat2"
>> >>
>> >> I don't think there is a robust way around this, but maybe others have
>> >> a
>> >> good idea?
>> >>
>> >>
>> > I just added a non-standard generic for "eval" that forces the
>> > evaluation of
>> > the arguments. Works in this case. Is that sensible?
>>
>> I can't think of an example where this will be positively wrong; it
>> won't surprise me when one is reported. I'm ok with the
>> nonstandardGeneric until then. Martin
>>
>> >
>> > Michael
>> >
>> >
>> >
>> >> Martin
>> >>>
>> >>> Kasper
>> >>>
>> >>> _______________________________________________
>> >>> Bioconductor mailing list
>> >>> Bioconductor at stat.math.ethz.ch
>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >>> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >>
>> >>
>> >> --
>> >> Martin Morgan
>> >> Computational Biology / Fred Hutchinson Cancer Research Center
>> >> 1100 Fairview Ave. N.
>> >> PO Box 19024 Seattle, WA 98109
>> >>
>> >> Location: Arnold Building M1 B861
>> >> Phone: (206) 667-2793
>> >>
>> >
>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>
>



More information about the Bioconductor mailing list