[R] A hiccup when using anova on gam() fits.

Martin Maechler maechler at stat.math.ethz.ch
Wed Jul 29 11:14:01 CEST 2009


>>>>> "RT" == Rolf Turner <r.turner at auckland.ac.nz>
>>>>>     on Wed, 29 Jul 2009 11:57:20 +1200 writes:

    RT> I stumbled across a mild glitch when trying to compare the
    RT> result of gam() fitting with the result of lm() fitting.


    RT> The following code demonstrates the problem:

    RT> library(gam)

hmm, you should state -- optimally even in the subject of your
message -- that you are *not* using the modern and well-developed
gam() function from *recommended* package  'mgcv'
but rather gam() from 'gam' package .. which was a 
sensationally good thing -- as (!#?&) Fortran code -- originally
in it's time in the 1980s, and soon with an S interface, 
(then 'S-PLUS' now 'S+')


    RT> x <- rep(1:10,10)
    RT> set.seed(42)
    RT> y <- rnorm(100)
    RT> fit1 <- lm(y~x)
    RT> fit2 <- gam(y~lo(x))

in mgcv:
       fit2 <- gam(y ~ s(x)) 

       {or one of the many possible  'gam(y ~ s(x, ....))' }

    RT> fit3 <- lm(y~factor(x))
    RT> print(anova(fit1,fit2)) # No worries.
    RT> print(anova(fit1,fit3)) # Likewise.
    RT> print(anova(fit2,fit3)) # Throws an error.

## not with  mgcv ..
## but also there, you rather want
fit3. <- gam(y ~ factor(x))# NB: gam(), not lm(), even though we fit linear!
anova(fit2,fit3.) # ... as you want it
## or maybe
anova(fit2,fit3., test = "F")


    RT> print(anova(fit3,fit2)) # ``Works'' but gives negative degrees of  
    RT> freedom and sum of squares.

##MM: ditto for mgcv

What is bad about these negative numbers?
After all they are *differences* of df's and RSS's and correctly
*are* negative ..
but then, I know you know that ...

## NOTE: The last one goes  anova() -> anova.lm() -> anova.lmlist()
##       and then "happens to work" with a gam() fit...
## For that reason, even here,

anova(fit3. , fit2) # << note the "."

## is to be preferred in my view

--
Martin Maechler, ETH Zurich

----------

    RT> Is this evidence of a ``bug''?  Or am I being terribly young and  
    RT> naive to
    RT> expect anova() to work at all in such circumstances?

    RT> The error thrown is:

    RT> Error in anova.glmlist(list(object, ...), test = test) :
    RT> (list) object cannot be coerced to type 'double'

    RT> cheers,

    RT> Rolf Turner




More information about the R-help mailing list