[Rd] Bug in update()? (PR#6902)

berwin at maths.uwa.edu.au berwin at maths.uwa.edu.au
Fri May 21 09:22:00 CEST 2004


Dear all,

I noticed the following while playing around with fitting log-linear
models to contingency tables using R 1.8.1, but the problem also
exists under R 1.9.0.

A reproducible example uses the following contingency table:


> library(MASS)
> data(quine)
> tmp <- with(quine, expand.grid(Eth=levels(Eth), Sex=levels(Sex), 
+                    Lrn=levels(Lrn), Age=levels(Age)))
> n <- nrow(quine)
> quine2 <- with(quine,
+       data.frame(tmp,
+       Count=as.vector(tapply(rep(1,n), list(Eth, Sex, Lrn, Age), sum))))

First fit a saturated model and see which term we can drop:
    > fm <- glm(Count ~ .^4, quine2, family=poisson)
    > drop1(fm, test="Chisq")
    Single term deletions
    
    Model:
    Count ~ .^4
                    Df  Deviance     AIC     LRT Pr(Chi)
    <none>             6.661e-16 148.916                
    Eth:Sex:Lrn:Age  2     0.422 145.338   0.422    0.81

Drop the four-way interaction and check which term we can drop next:
    > fm <- update(fm, . ~ . - Eth:Sex:Lrn:Age)
    > drop1(fm, test="Chisq")
    Single term deletions
    
    Model:
    Count ~ .
           Df Deviance     AIC     LRT  Pr(Chi)   
    <none>      35.893 142.810                    
    Eth     1   36.332 141.248   0.439 0.507811   
    Sex     1   37.238 142.154   1.345 0.246237   
    Lrn     1   37.392 142.308   1.499 0.220842   
    Age     3   49.818 150.735  13.925 0.003009 **
    ---
    Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

All of a sudden there are only the main effects left and all
interaction terms have been removed.  I expected to see at this point
a test for all three-way interactions.  Since I started of with the
saturated model and used update to remove the four-way interaction,
all two-way and three-way interactions should still be in the model.
Or am I missing some subtlety of the "^" and "." operators in model
formulae? 

On the other hand, if we specify the saturated model as follows:

    > fm <- glm(Count ~ Lrn*Eth*Sex*Age, quine2, family=poisson)

then the above steps have the desired effect:

    > drop1(fm, test="Chisq")
    Single term deletions
    
    Model:
    Count ~ Lrn * Eth * Sex * Age
                    Df  Deviance     AIC     LRT Pr(Chi)
    <none>             4.219e-15 148.916                
    Lrn:Eth:Sex:Age  2     0.422 145.338   0.422    0.81
    > fm <- update(fm, . ~ . - Eth:Sex:Lrn:Age)
    > drop1(fm, test="Chisq")
    Single term deletions
    
    Model:
    Count ~ Lrn + Eth + Sex + Age + Lrn:Eth + Lrn:Sex + Eth:Sex + 
        Lrn:Age + Eth:Age + Sex:Age + Lrn:Eth:Sex + Lrn:Eth:Age + 
        Lrn:Sex:Age + Eth:Sex:Age
                Df Deviance     AIC     LRT  Pr(Chi)   
    <none>            0.422 145.338                    
    Lrn:Eth:Sex  1    0.493 143.409   0.071 0.789909   
    Lrn:Eth:Age  2    0.629 141.545   0.207 0.901471   
    Lrn:Sex:Age  2   12.728 153.644  12.306 0.002127 **
    Eth:Sex:Age  3    1.019 139.935   0.597 0.897103   
    ---
    Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

The update command only removed the four-way interaction but left all
the two-way and three-way interaction in the model.  

I looked at update.formula to see what the problem might be, but after
reading the help file of .Internal I stopped looking around.... :)

Cheers,

        Berwin

--please do not edit the information below--

Version:
 platform = i686-pc-linux-gnu
 arch = i686
 os = linux-gnu
 system = i686, linux-gnu
 status = 
 major = 1
 minor = 9.0
 year = 2004
 month = 04
 day = 12
 language = R

Search Path:
 .GlobalEnv, package:MASS, package:methods, package:stats, package:graphics, package:utils, Autoloads, package:base



More information about the R-devel mailing list