[R] question about update()

Thu May 4 14:56:02 CEST 2023

Hi, Berwin, good to hear from you, and thanks for the detailed comments and suggestion.

Actually, my current experimental code works in the way that you suggest, calling lm.fit and glm.fit directly.  What I am trying to develop is an “improved” version of the code for distribution to other people. Hence I wanted to streamline the code, in particular avoiding branches for each fitting procedure (lm.fit, glm.fit and possibly more).  But I am now considering dropping the idea of the “improved” version and sticking to the direct calls to the fitting functions.
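For readers of the archive, here is a minimal sketch of the direct-call approach, with the branch between fitters confined to one small helper. It is not my actual code: the data are simulated, and the names X0, y, xf and fit_once are made up for illustration.

```r
# Minimal sketch: refit with a new column xf at each iteration, bypassing update().
# Simulated data; X0, y, xf and fit_once are illustrative names only.
set.seed(1)
n  <- 50
X0 <- cbind(`(Intercept)` = 1, x1 = rnorm(n))   # fixed part of the design matrix
y  <- 1 + 2 * X0[, "x1"] + rnorm(n)

# The helper hides the choice of fitter, so the loop itself has no branches.
fit_once <- function(X, y, family = NULL) {
  if (is.null(family)) lm.fit(X, y) else glm.fit(X, y, family = family)
}

for (i in 1:3) {
  xf  <- rnorm(n)                         # stand-in for the xf built at each iteration
  fit <- fit_once(cbind(X0, xf = xf), y)  # no formula parsing, no model frame
  print(fit$coefficients)
}
```

Passing, say, family = poisson() would route the same loop through glm.fit, provided y is a suitable response for that family.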

Duncan, thanks for your additional comments. It is true that my original message presented a very simplified picture of the problem, possibly an oversimplified one.  If I presented the problem with the full version of the code, it would look quite long and messy. If I manage to construct a reasonably simplified version of the code, I shall post the question again.

Best wishes,

Adelchi

> On 4 May 2023, at 11:44, Berwin A Turlach <berwin.turlach using gmail.com> wrote:
> 
> G'day Adelchi,
> 
> hope all is well with you.
> 
> On Thu, 4 May 2023 10:34:00 +0200
> Adelchi Azzalini via R-help <r-help using r-project.org> wrote:
> 
>> Thanks, Duncan. What you indicate is surely the ideal route.
>> Unfortunately, in my case this is not feasible, because the
>> construction of xf and the update call are within an iterative
>> procedure where xf is changed at each iteration, so that the steps 
>> 
>> obj$data <- cbind(obj$data, xf=xf)
>> new.obj <- update(obj, . ~ . + xf)
>> 
>> must be repeated hundreds of times, each with a different xf.
> 
> If memory serves correctly, update() takes the object that is passed to
> it, looks at what the call was that created that object, modifies that
> call according to the additional arguments, and finally executes the
> modified call.
> 
> So there are a lot of manipulations going on in update().  In particular
> it would result each time in a call to lm(), glm() or whatever call was
> used to create the object.  Inside any of these modelling functions a
> lot of symbolic manipulations/calculations are needed too (parsing the
> formula, creating the design matrix and response vector from the parsed
> formula and data frame, checking if weights are used &c).
> 
> If you do essentially the same calculation over and over again, with
> only minor modifications, all these symbolic manipulations just consume
> time.
> 
> IMHO, you will be better off to bypass update() and just use lm.fit()
> (for which lm() is a nice front-end) and glm.fit() (for which glm() is a
> nice front-end), or whatever routine does the grunt work of fitting the
> model to the data in your application (hopefully, the package creator
> used a setup of XXX.fit() to fit the model, called by XXX() that does
> all the fancy formula handling).
> 
> Cheers,
> 
> Berwin
>
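
The sequence of steps that Berwin attributes to update() can be mimicked by hand, roughly as below. This is only an approximation of what the default method does for a formula change, written to show where the repeated overhead comes from; the data frame d and the objects obj, cl and new.obj are simulated or made up for illustration.

```r
# Rough by-hand imitation of update(obj, . ~ . + xf); an illustrative sketch only.
set.seed(2)
d   <- data.frame(y = rnorm(20), x = rnorm(20), xf = rnorm(20))
obj <- lm(y ~ x, data = d)

cl         <- getCall(obj)                                # the call that created obj
cl$formula <- update.formula(formula(obj), . ~ . + xf)    # modify the formula in the call
new.obj    <- eval(cl)                                    # re-run lm() from scratch

print(coef(new.obj))
```

Each pass through these steps runs the whole of lm() again, including the formula and model-frame handling, which is the cost that the direct calls to lm.fit() and glm.fit() avoid.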