[R] comparing ancova models: summary

Matthew Wiener mcw at ln.nimh.nih.gov
Wed Dec 13 20:00:31 CET 2000

Thanks to John Fox, Brian Ripley, and Peter Dalgaard for responding.
The short answer (as in Peter Dalgaard's reply, already posted to the
list) is that the models I'm concerned with can in fact be compared using
anova.  The key fact is that while the parameters may not look nested, the
subspaces spanned by the two models are: the one-line model's subspace
lies inside that of the four-line model.
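
A minimal sketch of how one might check the nesting of the subspaces
numerically.  The data are simulated and hypothetical (the names n, A, x,
y and slopes are assumptions, not my real data).  It projects the columns
of the one-line design matrix onto the column space of a "four separate
lines" design matrix and looks at what is left over.

```r
# Hypothetical simulated data: a covariate x, a four-level factor A,
# and a response y that follows a different line for each level of A.
set.seed(1)
n <- 80
A <- factor(rep(c("a1", "a2", "a3", "a4"), each = n / 4))
x <- runif(n, 0, 10)
slopes <- c(0.5, 1.0, 1.5, 2.0)
y <- 1 + slopes[as.integer(A)] * x + rnorm(n)

# Design matrix of the single-line model, and of a parameterization with
# four intercepts and four slopes (no overall intercept or x column).
X_small <- model.matrix(~ x)
X_big   <- model.matrix(~ 0 + A + A:x)

# Project the columns of X_small onto the column space of X_big.
# Residuals that are numerically zero mean span(X_small) lies in span(X_big).
leftover <- qr.resid(qr(X_big), X_small)
nested <- max(abs(leftover)) < 1e-8

# The separate-lines parameterization and y ~ A*x span the same space,
# so the two fits should give the same fitted values.
fit_star  <- lm(y ~ A * x)
fit_lines <- lm(y ~ 0 + A + A:x)
same_fit <- isTRUE(all.equal(unname(fitted(fit_star)),
                             unname(fitted(fit_lines))))

print(nested)
print(same_fit)
```

If both checks come out TRUE, the parameters differ between the two
codings but the fitted subspace is the same, which is what matters for
the comparison.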

An additional note from Prof. Ripley on AIC and BIC (which I quote in
order to avoid misreporting):

AIC and (the several varieties of) BIC have different purposes and
different approximations.  AIC is designed to find a model that is large
enough for good predictions, so tends to include terms that might be
useful.  BIC is based on some gross approximations to finding the most
supported model, and the various versions differ in how gross the
approximations are (although a series in log n effectively does not
converge even for data-mining-sized datasets). 
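
As a rough illustration (not part of Prof. Ripley's note), the two
criteria can be computed in R with AIC() and BIC().  The data below are
simulated and hypothetical, and BIC() here is only the Schwarz form with
a penalty of log(n) per parameter, one of the several varieties mentioned
above.

```r
# Hypothetical simulated data with a four-level factor A and covariate x.
set.seed(2)
n <- 80
A <- factor(rep(c("a1", "a2", "a3", "a4"), each = n / 4))
x <- runif(n, 0, 10)
y <- 1 + c(0.5, 1.0, 1.5, 2.0)[as.integer(A)] * x + rnorm(n)

fit_one  <- lm(y ~ x)      # one line
fit_four <- lm(y ~ A * x)  # one line per level of A

# AIC charges 2 per estimated parameter; BIC() charges log(n).
# The df column counts the coefficients plus the error variance.
aic <- AIC(fit_one, fit_four)
bic <- BIC(fit_one, fit_four)
print(aic)
print(bic)

# BIC() is the same quantity as AIC() with penalty k = log(n).
bic_via_aic <- AIC(fit_four, k = log(nobs(fit_four)))
print(all.equal(BIC(fit_four), bic_via_aic))
```

Because the penalties differ, the two criteria need not rank the models
the same way.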

Thanks again for all the help.

Matt Wiener

original message:

Hello, all.

I've got what is probably a simple question about comparison of models
using anova, specifically about the situations in which it's valid.  I
understand, I think, what's going on when the models are strictly
nested (as most are in the demo(lm) examples).  My question involves
what happens when the models aren't strictly nested.

In my particular case, I'm doing ancovas.  I've got values x and y for
each point, and also a factor A with four levels.  My first model is just
y ~ x, with one line.  I'd like to compare this to a model with one line
for each level of the factor, y ~ A*x.  But my original line is gone now
-- there's no effect of x overall left in the model, just the 4 sub-models
-- so this isn't nested in the same sense that, say, y ~ a and y ~ a + b
and y ~ a + b + a:b are when everything is a factor.  (But I think it may
be the same as part of the birthweight, sex, and age example in the demo,
where it goes from birthweight ~ sex + age to birthweight ~ sex + age +
sex:age and you get 2 slopes instead of 1.) 

What I'm not quite sure of is whether it is valid to compare my two models
using anova, and if not, how I can compare them.  I am interested
specifically in whether going to a model with 8 parameters gets
sufficiently better prediction to be considered significant, so the F-test
format seems like what I want.  The anova function gives me results with
no problem, but the fact that I can hand it the two objects doesn't mean
I'm making sense.
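
For concreteness, a sketch of the comparison in question, on simulated
hypothetical data (the variable names and the data-generating values are
assumptions).  It runs anova() on the two fits and then recomputes the F
statistic by hand from the residual sums of squares, to show what is
being tested.

```r
# Hypothetical simulated data: covariate x, four-level factor A, response y.
set.seed(3)
n <- 80
A <- factor(rep(c("a1", "a2", "a3", "a4"), each = n / 4))
x <- runif(n, 0, 10)
y <- 1 + c(0.5, 1.0, 1.5, 2.0)[as.integer(A)] * x + rnorm(n)

fit_one  <- lm(y ~ x)      # 2 coefficients: one line
fit_four <- lm(y ~ A * x)  # 8 coefficients: one line per level of A

# F-test of the smaller model against the larger one.
tab <- anova(fit_one, fit_four)
print(tab)

# The same F statistic from the residual sums of squares and residual df.
rss0 <- deviance(fit_one)
rss1 <- deviance(fit_four)
df0  <- df.residual(fit_one)
df1  <- df.residual(fit_four)
F_manual <- ((rss0 - rss1) / (df0 - df1)) / (rss1 / df1)
p_manual <- pf(F_manual, df0 - df1, df1, lower.tail = FALSE)
print(c(F = F_manual, p = p_manual))
```

The numerator has 6 degrees of freedom, the difference between 8 and 2
coefficients.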

I've also looked at the Akaike and Bayes information criteria for this,
which give different results from one another.  (I'm assuming that a
lower value of either criterion is what indicates better prediction.)

Any advice and/or references appreciated.


Matt Wiener
Laboratory of Neuropsychology
Bethesda, MD 20892
301-496-5625 x254
mcw at ln.nimh.nih.gov
