[R] Parameterization puzzle

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Jul 21 08:20:00 CEST 2006


R does not know that poly(age,2) and poly(age,1) are linearly dependent.
(And indeed they only are for some functions 'poly'.)

I cannot reproduce your example ('l' is missing), but perhaps

glm(deaths ~ poly(age,2) + poly(age,1)*Smoke + offset(l), poisson)

was your intention?

On Fri, 21 Jul 2006, Murray Jorgensen wrote:

> Consider the following example (based on an example in Pat Altham's GLM 
> notes)
> 
> pyears <- scan()
> 18793 52407 10673 43248 5710 28612 2585 12663 1462 5317
> 
> deaths <- scan()
> 2 32 12 104 28 206 28 186 31 102
> 
> Smoke <- gl(2,1,10,labels=c("No","Yes"))
> Age <- gl(5,2,10,labels=c("35-44","45-54","55-64","65-74","75-84"),
>             ordered=TRUE)
> mod1.glm <- glm(deaths ~ Age * Smoke + offset(l),family=poisson)
> summary(mod1.glm)
> age <- as.numeric(Age)
> mod2.glm <- aso1.glm <- glm(deaths ~ poly(age,2) + Smoke +
>                        poly(age,1):Smoke + offset(l),family=poisson)
> summary(mod2.glm)
> 
> 
> 
> The business part of the summary for the first model
> 
>                 Estimate Std. Error z value Pr(>|z|)
> (Intercept)    -5.92706    0.16577 -35.754  < 2e-16 ***
> Age.L           4.06490    0.47414   8.573  < 2e-16 ***
> Age.Q          -1.08293    0.41326  -2.620 0.008781 **
> Age.C           0.24158    0.31756   0.761 0.446816
> Age^4           0.04244    0.23061   0.184 0.853986
> SmokeYes        0.61916    0.17296   3.580 0.000344 ***
> Age.L:SmokeYes -1.31234    0.49267  -2.664 0.007729 **
> Age.Q:SmokeYes  0.39043    0.43008   0.908 0.363976
> Age.C:SmokeYes -0.29593    0.33309  -0.888 0.374298
> Age^4:SmokeYes -0.03682    0.24432  -0.151 0.880218
> 
> inspires me to fit the second model that omits the nonsignificant terms, 
> however this produces the summary
> 
>                        Estimate Std. Error z value Pr(>|z|)
> (Intercept)            -5.8368     0.1213 -48.103  < 2e-16 ***
> poly(age, 2)1           3.9483     0.1755  22.497  < 2e-16 ***
> poly(age, 2)2          -1.0460     0.1448  -7.223 5.08e-13 ***
> SmokeYes                0.5183     0.1262   4.106 4.02e-05 ***
> SmokeNo:poly(age, 1)    1.3755     0.4340   3.169  0.00153 **
> SmokeYes:poly(age, 1)       NA         NA      NA       NA
> 
> Why do we get a SmokeNo:poly(age, 1) term? Can I re-express mod2.glm so 
> that this term does not appear?
> 
> Cheers,  Murray Jorgensen
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list