[Rd] robust updating methods

Thierry Onkelinx thierry.onkelinx at inbo.be
Sat Mar 28 12:01:18 CET 2015


Dear Ben,

Fitting the model and calculating the confidence intervals within the same
function works (
https://github.com/ThierryO/testlme4/blob/master/tests/testthat/test_fit_model_ci.R
passes).
Fitting the model inside a function and calculating the confidence
intervals on the output still fails (
https://github.com/ThierryO/testlme4/blob/master/tests/testthat/test_fit_model.R
fails).

Directly calculating the confidence intervals in the same function is an
acceptable solution for my work. Thank you for this solution.

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-03-27 22:36 GMT+01:00 Ben Bolker <bbolker at gmail.com>:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>   [Sorry to those who don't like it for top-posting]
>
>   Thierry, I'm curious whether this addresses your problem (although
> we don't have a hard timetable for the next release [it has to avoid
> conflicts with the 3.2.0 release in 2.5 weeks at the very least], so
> this might be problematic if your package needs to depend on it).
>
>   I'm still curious whether there are any ideas/opinions from other
> readers.  Has anyone else struggled with this?  Is there a canonical
> solution?
>
>  Ben Bolker
>
>
> On 15-03-24 07:55 PM, Ben Bolker wrote:
> > On 15-03-23 12:55 PM, Thierry Onkelinx wrote:
> >> Dear Ben,
> >
> >> Last week I was struggling with incorporating lme4 into a
> >> package. I traced the problem and made a reproducible example (
> >> https://github.com/ThierryO/testlme4).  It looks very simular to
> >> the problem you describe.
> >
> >> The 'tests' directory contains the reproducible examples.
> >> confint() of a model as returned by a function fails. It even
> >> fails when I try to calculate the confint() inside the same
> >> function as the glmer() call (see the fit_model_ci function).
> >
> >> Best regards,
> >
> >> Thierry
> >
> >
> > Ugh.  I can get this to work if I also try searching up the call
> > stack, as follows (within update.merMod).  This feels like "code
> > smell" to me though -- i.e., if I have to hack this hard I must be
> > doing something wrong/misunderstanding how the problem *should* be
> > done.
> >
> >
> > if (evaluate) { ff <- environment(formula(object)) pf <-
> > parent.frame()  ## save parent frame in case we need it sf <-
> > sys.frames()[[1]] tryCatch(eval(call, env=ff), error=function(e) {
> > tryCatch(eval(call, env=sf), error=function(e) { eval(call, pf) })
> > }) } else call
> >
> > Here is an adapted even-more-minimal version of your code, which
> > seems to work with the version of update.merMod I just pushed to
> > github, but fails for glm():
> >
> >
> > ##
> > https://github.com/ThierryO/testlme4/blob/master/R/fit_model_ci.R
> > fit_model_ci <- function(formula, dataset, mfun=glmer){ model <-
> > mfun( formula = formula, data = dataset, family = "poisson" ) ci <-
> > confint(model) return(list(model = model, confint = ci)) }
> >
> > library("lme4") set.seed(101) dd <-
> > data.frame(f=factor(rep(1:10,each=100)), y=rpois(2,1000))
> > fit_model_ci(y~(1|f),dataset=dd)
> > fit_model_ci(y~(1|f),dataset=dd,mfun=glm)
> >
> >
> >
> >
> >
> >> ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek /
> >> Research Institute for Nature and Forest team Biometrie &
> >> Kwaliteitszorg / team Biometrics & Quality Assurance
> >> Kliniekstraat 25 1070 Anderlecht Belgium
> >
> >> To call in the statistician after the experiment is done may be
> >> no more than asking him to perform a post-mortem examination: he
> >> may be able to say what the experiment died of. ~ Sir Ronald
> >> Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner
> >> The combination of some data and an aching desire for an answer
> >> does not ensure that a reasonable answer can be extracted from a
> >> given body of data. ~ John Tukey
> >
> >> 2015-03-22 17:45 GMT+01:00 Ben Bolker <bbolker at gmail.com>:
> >
> >> WARNING: this is long.  Sorry I couldn't find a way to compress
> >> it.
> >
> >> Is there a reasonable way to design an update method so that it's
> >>  robust to a variety of reasonable use cases of generating calls
> >> or data inside or outside a function?  Is it even possible?
> >> Should I just tell users "don't do that"?
> >
> >> * `update.default()` uses `eval(call, parent.frame())`; this
> >> fails when the call depends on objects that were defined in a
> >> different environment (e.g., when the data are generated and the
> >> model initially fitted within a function scope)
> >
> >> * an alternative is to store the original environment in which
> >> the fitting is done in the environment of the formula and use
> >> `eval(call, env=environment(formula(object)))`; this fails if the
> >>  user tries to update the model originally fitted outside a
> >> function with data modified within a function ...
> >
> >> * I think I've got a hack that works below, which first tries in
> >> the environment of the formula and falls back to the parent
> >> frame if that fails, but I wonder if I'm missing something much
> >> simpler ..
> >
> >> Thoughts?  My understanding of environments and frames is still,
> >>  after all these years, not what it should be ...
> >
> >> I've thought of some other workarounds, none entirely
> >> satisfactory:
> >
> >> * force evaluation of all elements in the original call *
> >> printing components of the call can get ugly (can save the
> >> original call before evaluating) * large objects in the call get
> >> duplicated * don't use `eval(call)` for updates; instead try to
> >> store everything internally * this works OK but has the same
> >> drawback of potentially storing large extra copies * we could try
> >> to use the model frame (which is stored already), but there are
> >> issues with this (the basis of a whole separate rant) because the
> >> model frame stores something in between predictor variables and
> >> input variables. For example
> >
> >> d <- data.frame(y=1:10,x=runif(10))
> >> names(model.frame(lm(y~log(x),data=d))) ## "y" "log(x)"
> >
> >> So if we wanted to do something like update to "y ~ sqrt(x)", it
> >> wouldn't work ...
> >
> >> ================== update.envformula <- function(object,...) {
> >> extras <- match.call(expand.dots = FALSE)$... call <-
> >> getCall(object) for (i in names(extras)) { existing <-
> >> !is.na(match(names(extras), names(call))) for (a in
> >> names(extras)[existing]) call[[a]] <- extras[[a]] if
> >> (any(!existing)) { call <- c(as.list(call), extras[!existing])
> >> call <- as.call(call) } } eval(call,
> >> env=environment(formula(object))) ## enclos=parent.frame()
> >> doesn't help }
> >
> >> update.both <- function(object,...) { extras <-
> >> match.call(expand.dots = FALSE)$... call <- getCall(object) for
> >> (i in names(extras)) { existing <- !is.na(match(names(extras),
> >> names(call))) for (a in names(extras)[existing]) call[[a]] <-
> >> extras[[a]] if (any(!existing)) { call <- c(as.list(call),
> >> extras[!existing]) call <- as.call(call) } } pf <-
> >> parent.frame() ## save parent frame in case we need it
> >> tryCatch(eval(call, env=environment(formula(object))),
> >> error=function(e) { eval(call, pf) }) }
> >
> >> ### TEST CASES
> >
> >> set.seed(101) d <- data.frame(x=1:10,y=rnorm(10)) m1 <-
> >> lm(y~x,data=d)
> >
> >> ##' define data within function, return fitted model f1 <-
> >> function() { d2 <- d lm(y~x,data=d2) return(lm(y~x,data=d2)) }
> >> ##' define (and modify) data within function, try to update ##'
> >> model fitted elsewhere f2 <- function(m) { d2 <- d; d2[1] <-
> >> d2[1]+0 ## force copy update.default(m,data=d2) } ##' define (and
> >> modify) data within function, try to update ##' model fitted
> >> elsewhere (use envformula) f3 <- function(m) { d2 <- d; d2[1] <-
> >> d2[1]+0 ## force copy update.envformula(m,data=d2) }
> >
> >> ##' hack: first try the formula, then the parent frame ##' if
> >> that doesn't work for any reason f4 <- function(m) { d2 <- d;
> >> d2[1] <- d2[1]+0 ## force copy update.both(m,data=d2) }
> >
> >> ## Case 1: fit within function m2 <- f1()
> >> try(update.default(m2)) ## default: object 'd2' not found m3A <-
> >> update.envformula(m2)  ## envformula: works m3B <-
> >> update.both(m2)        ## works
> >
> >> ## Case 2: update within function m4A <- f2(m1)  ## default:
> >> works try(f3(m1))    ## envformula: object 'd2' not found m4B <-
> >> f4(m1) ## works
> >
> >>>
> >>> ______________________________________________
> >>> R-devel at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >
> >
> >
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
>
> iQEcBAEBAgAGBQJVFc1CAAoJEOCV5YRblxUHF+MH/3Y9uFZFolhx5b5jWSyXwQgp
> i9oawx4K6il0qiAiDiO5D7NSSdc0u9jlgj8NjH0G2O9u3ctpvcYNVwa7cP9288Xz
> xRyInnnh2FIpT6W0XyzJDivw5EX3IkyYuv6eDNqVyGcYXkvzJMA+vwMMWdGWEZbL
> jKtDc0trG+9yJnwIi6DW6IQWPovrDaNxEinS+V7+DmYACQvJ4P2kg2u/ZsxAx89q
> mcA1pS5usJjkOiQwBVUvV7l2UKNhHPFNwbBK1QdHgpP7PTdB52EQr+IyERhpf56s
> 8tYyNbSSPWoG9vt6/1pKyUK4iNRBtGgxtuozAv5OUjF8VGWGwUXBLo5G2yrBbs4=
> =o1PJ
> -----END PGP SIGNATURE-----
>

	[[alternative HTML version deleted]]



More information about the R-devel mailing list