[R] comparing ancova models

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Wed Dec 13 10:07:32 CET 2000

Matthew Wiener <mcw at ln.nimh.nih.gov> writes:

> I've got what is probably a simple question about comparison of models
> using anova, specifically about the situations in which it's valid.  I
> understand, I think, what's going on when the models are strictly
> nested (as most are in the demo(lm) examples).  My question involves
> what happens when the models aren't strictly nested.
> In my particular case, I'm doing ancovas.  I've got values x and y for
> each point, and also a factor A with four levels.  My first model is just
> y ~ x, with one line.  I'd like to compare this to a model with one line
> for each level of the factor, y ~ A*x.  But my original line is gone now
> -- there's no effect of x overall left in the model, just the 4 sub-models
> -- so this isn't nested in the same sense that, say y ~ a and y ~ a + b
> and y ~ a + b + a:b are when everything is a factor.  (But I think it may
> be the same as part of the birthweight, sex, and age example in the demo,
> where it goes from birthweight ~ sex + age to birthweight ~ sex + age +
> sex:age and you get 2 slopes instead of 1.) 
> What I'm not quite sure of is whether it is valid to compare my two models
> using anova, and if not, how I can compare them.  I am interested
> specifically in whether going to a model with 8 parameters gets
> sufficiently better prediction to be considered significant, so the F-test
> format seems like what I want.  The anova function gives me results with
> no problem, but the fact that I can hand it the two objects doesn't mean
> I'm making sense.

Unless I'm reading you completely wrong, the two models *are*
nested. One way of deciding nestedness is to check whether the
smaller model can be obtained by putting restrictions on the
parameters of the larger one. That is certainly the case here: the
smaller model says that all 4 slopes are equal and all 4 intercepts
are equal.

The standard parametrization for the four-lines model is in fact
one line plus "delta values" for the intercepts and slopes of the
other three groups, relative to the first group. If the delta values
are all zero, you get the one-line model back. (A * x is really
A + x + A:x, from which you want to remove A:x and A. Removing only
the interaction term gives you a parallel lines model.)
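
To see that parametrization, here is a small sketch on simulated data.
The data frame d, the level labels a1..a4 and the name fit4 are made up
for illustration and are not from the original question; it also
assumes R's default treatment contrasts are in effect.

```r
# Simulated data, purely illustrative: 4 groups of 10 points each.
set.seed(1)
d <- data.frame(A = gl(4, 10, labels = c("a1", "a2", "a3", "a4")),
                x = runif(40))
d$y <- 1 + 2 * d$x + rnorm(40, sd = 0.5)

# Four-lines model; A * x expands to A + x + A:x.
fit4 <- lm(y ~ A * x, data = d)

# With treatment contrasts the 8 coefficients are:
#   (Intercept), x          : intercept and slope of the first group
#   Aa2, Aa3, Aa4           : intercept deltas relative to the first group
#   Aa2:x, Aa3:x, Aa4:x     : slope deltas relative to the first group
names(coef(fit4))
```

Setting the six delta coefficients to zero leaves just (Intercept) and
x, which is the one-line model y ~ x.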

So, anova() should handle this nicely.
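
A minimal sketch of the comparison, again on simulated data (the data
frame d and the names fit1, fit2 and fit4 are invented for the example,
not taken from the original question):

```r
# Simulated data, purely illustrative: 4 groups of 10 points each.
set.seed(1)
d <- data.frame(A = gl(4, 10), x = runif(40))
d$y <- 1 + 2 * d$x + rnorm(40, sd = 0.5)

fit1 <- lm(y ~ x,     data = d)   # one line:        2 parameters
fit2 <- lm(y ~ A + x, data = d)   # parallel lines:  5 parameters
fit4 <- lm(y ~ A * x, data = d)   # four lines:      8 parameters

# F test of the one-line model against the four-lines model;
# the 6 extra parameters are the intercept and slope deltas.
tab <- anova(fit1, fit4)
print(tab)

# The intermediate parallel-lines model can be included in the sequence.
print(anova(fit1, fit2, fit4))
```

The first table tests all six extra parameters at once, which is the
question of whether going from 2 to 8 parameters is worth it.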

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907