[R] anova model refinement/clustering question

Murad Nayal mn216 at columbia.edu
Thu Oct 23 08:00:46 CEST 2003



Hi,

I am trying to refine models with a continuous response variable and a
number of categorical predictor variables. I know of some model
refinement tools in R that help with the selection of model terms, such
as dropterm and addterm from MASS. However, I would also like to try to
refine the model by 'coalescing' some levels of some of the predictor
factors. Is there a standard procedure, or are there R functions, that
will allow me to do this?

This might be naive, but I thought that one way to do this is to perform
a pairwise comparison between all levels, say using TukeyHSD, and
coalesce levels whose mean responses do not differ by a statistically
significant amount. In a way, this becomes a clustering problem. Is
there a relatively easy way to do this in R, short of figuring out how
to make the relevant TukeyHSD output look like a dist object and
tricking hclust into using it?
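
To make the idea concrete, here is a rough sketch of the kind of thing I
have in mind. The data are simulated, the names (g, y, d, grp, g2) are
made up for illustration, and using 1 minus the adjusted p-value as a
dissimilarity, complete linkage, and k = 2 are all my own assumptions,
not an established procedure:

```r
# Simulated data (illustration only): one factor with five levels whose
# true means fall into two groups, {a, b, c} and {d, e}.
set.seed(1)
g  <- factor(rep(c("a", "b", "c", "d", "e"), each = 20))
mu <- c(a = 0, b = 0, c = 0, d = 5, e = 5)
y  <- rnorm(length(g), mean = mu[as.character(g)], sd = 1)

fit <- aov(y ~ g)
tk  <- TukeyHSD(fit)$g   # matrix with columns diff, lwr, upr, p adj

# Row names look like "b-a"; split them to recover the two levels.
# (This breaks if the level names themselves contain "-".)
pairs <- do.call(rbind, strsplit(rownames(tk), "-", fixed = TRUE))

# Symmetric dissimilarity matrix: 1 - adjusted p-value (an assumption;
# abs(tk[, "diff"]) would be another candidate).
lv <- levels(g)
d  <- matrix(0, length(lv), length(lv), dimnames = list(lv, lv))
d[pairs] <- 1 - tk[, "p adj"]
d[pairs[, 2:1, drop = FALSE]] <- 1 - tk[, "p adj"]

hc  <- hclust(as.dist(d), method = "complete")
grp <- cutree(hc, k = 2)   # number of groups chosen by hand here

# Coalesce the factor levels according to cluster membership and
# compare the reduced model with the full one.
g2   <- factor(grp[as.character(g)])
fit2 <- aov(y ~ g2)
anova(fit2, fit)
```

I realize that choosing the groups from the same data and then testing
the reduced model against the full one is not a clean test, so the final
comparison is only meant as a rough check.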

I am somewhat of an amateur in the field (and in R), and I am probably
making that obvious. Any guidance toward the 'right' way to approach this
(privately or on the list) would be much appreciated.

many thanks
Murad



-- 
Murad Nayal M.D. Ph.D.
Department of Biochemistry and Molecular Biophysics
College of Physicians and Surgeons of Columbia University
630 West 168th Street. New York, NY 10032
Tel: 212-305-6884	Fax: 212-305-6926



