[Rd] update forgets about offset() (PR#6656)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Mar 10 10:19:26 MET 2004
On Tue, 9 Mar 2004 Mark.Bravington at csiro.au wrote:
> In R1.7 and above (including R 1.9 alpha), 'update.formula' forgets to copy any offset(...) term in the original '.' formula:
> test> df <- data.frame( x=1:4, y=sqrt( 1:4), z=c(2:4,1))
> test> fit1 <- glm( y~offset(x)+z, data=df)
> test> fit1$call
> glm(formula = y ~ offset(x) + z, data = df)
> test> fit1u <- update( fit1, ~.)
> test> fit1u$call
> glm(formula = y ~ z, data = df)
> The problem occurs when 'update.formula' calls 'terms.formula(..., simplify=TRUE)' which defines and calls a function 'fixFormulaObject'. The first line of 'fixFormulaObject' attempts to extract the contents of the RHS of the formula via
> tmp <- attr(terms(object), "term.labels")
> but this omits any offsets. Replacing that line with the following,
> which I think pulls in everything except the response, *seems* to fix
> the problem without disrupting the guts of 'terms' itself:
> tmp <- dimnames( attr(terms(object), "factors"))[[1]][ -attr( terms, 'response')]
> The suggested line might be simpler than checking the 'offset' component
> of 'terms(object)', which won't always exist.
Sorry, but that is a common programming error. The possible values of
attr(terms, "response") are 0 or 1 (although code should not rely on the
non-existence of 2, 3, ...). foo[-0] == foo[0] is a length-0 vector.
Also, in R please use rownames(): it is easier to read and safer.
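
A minimal sketch of the pitfall described above, not taken from the R sources. The helper name drop_response and the two formulas are made up for illustration. It shows that negative indexing by attr(, "response") silently drops everything when the formula has no response, and one way to guard against that.

```r
# Hypothetical helper (not part of R): variable names of a formula's
# terms object with the response, if any, removed.
drop_response <- function(formula) {
    tt   <- terms(formula)
    vars <- rownames(attr(tt, "factors"))  # variables, offsets included
    resp <- attr(tt, "response")           # 0 when there is no response
    if (resp > 0) vars[-resp] else vars    # never index with -0
}

with_resp <- y ~ offset(x) + z
no_resp   <- ~ offset(x) + z

# Unguarded negative indexing works when there is a response ...
v1 <- rownames(attr(terms(with_resp), "factors"))
v1[-attr(terms(with_resp), "response")]

# ... but without one the index is -0, which is 0, so nothing is kept.
v0 <- rownames(attr(terms(no_resp), "factors"))
length(v0[-attr(terms(no_resp), "response")])

# The guarded helper returns the same right-hand-side variables both times.
drop_response(with_resp)
drop_response(no_resp)
```

The guard only addresses the -0 case. A formula whose right-hand side is nothing but an offset, such as y ~ offset(x), has an empty "factors" attribute with no row names, so this sketch would need a different source for the variable names there.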
> Footnote: strange things happen when there is more than one offset (OK,
> there probably shouldn't be, but I thought I'd experiment):
That is allowed, and works in general.
> test> fit2 <- glm( y ~ offset( x) + offset( log( x)) + z, data=df)
> test> fit2$call
> glm(formula = y ~ offset(x) + offset(log(x)) + z, data = df)
> test> fit2u <- update( fit2, ~.)
> test> fit2u$call
> glm(formula = y ~ offset(log(x)) + z, data = df)
> Curiously, the 'term.labels' attribute of 'terms(object)' now includes the second offset, but not the first.
The issue here is that the code to remove offset terms fails if two
successive terms are offsets, but not otherwise.
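
For anyone wanting to check a given R version, here is a rough sketch that lists the offsets of a formula from its terms object and compares them before and after update(). The helper name offsets_of is hypothetical, and the formulas reuse the ones from the report. It assumes the "offset" attribute indexes the "variables" list with the leading list() dropped, and it is only a diagnostic, not the fix applied in R itself.

```r
# Hypothetical helper (not part of R): the offset terms of a formula,
# as character strings, or character(0) if there are none.
offsets_of <- function(formula) {
    tt   <- terms(formula)
    ind  <- attr(tt, "offset")                       # NULL when no offsets
    vars <- as.character(attr(tt, "variables"))[-1]  # drop the leading "list"
    if (length(ind)) vars[ind] else character(0)
}

one_offset  <- y ~ offset(x) + z
two_offsets <- y ~ offset(x) + offset(log(x)) + z

# Offsets are kept out of "term.labels" and recorded separately.
attr(terms(two_offsets), "term.labels")
offsets_of(two_offsets)

# TRUE for each formula if update() has kept every offset.
setequal(offsets_of(update(one_offset,  ~ .)), offsets_of(one_offset))
setequal(offsets_of(update(two_offsets, ~ .)), offsets_of(two_offsets))
```

On an affected version the comparisons should come out FALSE, matching the fit1u and fit2u calls quoted above.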
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595