[Rd] update forgets about offset() (PR#6656)

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Mar 10 10:19:26 MET 2004


On Tue, 9 Mar 2004 Mark.Bravington at csiro.au wrote:

> In R 1.7 and above (including R 1.9 alpha), 'update.formula' forgets to copy any offset(...) term in the original '.' formula:
> 
> test> df <- data.frame( x=1:4, y=sqrt( 1:4), z=c(2:4,1))
> test> fit1 <- glm( y~offset(x)+z, data=df)
> test> fit1$call
> glm(formula = y ~ offset(x) + z, data = df)
> 
> test> fit1u <- update( fit1, ~.)
> test> fit1u$call
> glm(formula = y ~ z, data = df)
> 
> 
> The problem occurs when 'update.formula' calls 'terms.formula(..., simplify=TRUE)' which defines and calls a function 'fixFormulaObject'. The first line of 'fixFormulaObject' attempts to extract the contents of the RHS of the formula via 
> 
> tmp <- attr(terms(object), "term.labels")
> 
> but this omits any offsets. Replacing that line with the following,
> which I think pulls in everything except the response, *seems* to fix
> the problem without disrupting the guts of 'terms' itself:
> 
> tmp <- dimnames( attr(terms(object), "factors"))[[1]][ -attr( terms, 'response')]
> 
> The suggested line might be simpler than checking the 'offset' component
> of 'terms(object)', which won't always exist.
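
For readers following along, the behaviour described in the quoted report can be inspected directly on a terms object. This is only a minimal sketch, assuming a reasonably current R session; the formula mirrors the one in the report and the name 'tt' is illustrative.

```r
# terms() works on a bare formula, so no data frame is needed here.
tt <- terms(y ~ offset(x) + z)

attr(tt, "term.labels")   # offsets are not listed among the term labels
attr(tt, "offset")        # position of the offset within attr(tt, "variables")
attr(tt, "response")      # position of the response, or 0 if there is none

# Without an offset() term the "offset" attribute is absent (NULL),
# which is why code cannot assume it always exists.
attr(terms(y ~ z), "offset")
```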

Sorry, but that is a common programming error.  The possible values of
attr(terms, "response") are 0 or 1 (although code should not rely on the 
non-existence of 2, 3, ...).  When it is 0 (no response), foo[-0] is the
same as foo[0], a length-0 vector, so the suggested line would drop every
term rather than none.

Also, in R please use rownames(): it is easier to read and safer.
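
A minimal sketch of the pitfall and of one way to guard against it. The helper name 'rhs_variables' is hypothetical and is not part of R; it returns the row names of the "factors" matrix (variables, not terms), so it only illustrates the guard and the use of rownames(), not a proposed patch to 'fixFormulaObject'.

```r
foo <- c("y", "offset(x)", "z")

foo[-1]                      # drops the first element, as intended
foo[-0]                      # drops everything: -0 is just 0
identical(foo[-0], foo[0])   # both are character(0)

# Hypothetical helper: variables on the right-hand side of a formula,
# removing the response only when there is one.
rhs_variables <- function(object) {
  tt   <- terms(object)
  vars <- rownames(attr(tt, "factors"))
  resp <- attr(tt, "response")
  if (resp > 0) vars[-resp] else vars
}

rhs_variables(y ~ offset(x) + z)   # two-sided formula, response removed
rhs_variables(~ offset(x) + z)     # one-sided formula, nothing to remove
```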

> Footnote: strange things happen when there is more than one offset (OK,
> there probably shouldn't be, but I thought I'd experiment):

That is allowed, and works in general.

> test> fit2 <- glm( y ~ offset( x) + offset( log( x)) + z, data=df)
> test> fit2$call
> glm(formula = y ~ offset(x) + offset(log(x)) + z, data = df)
> 
> test> fit2u <- update( fit2, ~.)
> test> fit2u$call
> glm(formula = y ~ offset(log(x)) + z, data = df)
> 
> Curiously, the 'term.labels' attribute of 'terms(object)' now includes the second offset, but not the first.

The issue here is that the code to remove offset terms fails if two
successive terms are offsets, but not otherwise.
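
A quick way to check a given R version for this, sketched under the assumption that a correctly behaving terms() keeps every offset out of "term.labels" and records each one in the "offset" attribute; the name 'tt2' is illustrative.

```r
# Two successive offset() terms, as in the quoted footnote.
tt2 <- terms(y ~ offset(x) + offset(log(x)) + z)

labels2 <- attr(tt2, "term.labels")
labels2                          # should contain no offset(...) entries
attr(tt2, "offset")              # should index both offsets

# TRUE if any offset leaked into the term labels (the symptom reported).
any(grepl("^offset\\(", labels2))
```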

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


