[R] logistic regression or discriminant analysis ?

Jonathan Baron baron at cattell.psych.upenn.edu
Fri May 24 13:08:42 CEST 2002

On 05/24/02 10:49, Daniel Amorèse wrote:
>I have 2 groups (perhaps 3, if I subdivide a group into 2) of
>data. These data are described by at least 15 parameters.
>What I want to do: from these 15 variables, I want to get the
>subset providing the largest distance between groups.
>This is why a stepwise approach interests me. AM I WRONG ?
>My first purpose is not to predict group membership because
>I think other parameters (not available, hard to measure) can 
>(slightly, I hope) modify the group membership. Thus, I am
>not interested in establishing an accurate discrimination
>rule. I just want to know which subset of the 15 variables
>is most likely to contribute to the discrimination.

It sounds to me as though cluster analysis would be helpful,
either hierarchical (e.g., hclust) or kmeans.  There are at least
two relevant R packages: cluster and mva.  I don't think that any
method has an automatic stepwise add-on, but it also sounds like
you could do the stepwise part "by hand."  Most clustering
methods give you the within-cluster variance of each cluster.
And, once you have the clusters, you could use aov, perhaps even
with stepAIC (from the MASS package), to find the best
predictors.  If you do it by hand, you could keep removing the
least useful variable and observe its effect on the goodness of
the clustering.
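
A minimal sketch of the "by hand" procedure described above, under
several assumptions that are not in the original message: the built-in
iris data stands in for the poster's 15 variables, kmeans is the
clustering method, the number of clusters is fixed at 3, and "goodness"
is taken to be the ratio of between-cluster to total sum of squares.
The helper names cluster_quality and backward_drop are hypothetical.
In current versions of R, kmeans and hclust are in the stats package.

```r
# Sketch only: backward elimination "by hand" around kmeans().
# 'iris' ships with R and is a stand-in for the real data; the helper
# names below are made up for this illustration.

set.seed(1)  # kmeans uses random starting centers

# Goodness of a clustering: share of the total sum of squares that
# lies between clusters (closer to 1 means better separated clusters).
cluster_quality <- function(x, k) {
  fit <- kmeans(x, centers = k, nstart = 20)
  fit$betweenss / fit$totss
}

# Repeatedly drop the variable whose removal leaves the best clustering.
backward_drop <- function(x, k, min_vars = 1) {
  vars <- colnames(x)
  path <- data.frame(dropped = NA_character_,
                     n_vars = length(vars),
                     quality = cluster_quality(x[, vars, drop = FALSE], k),
                     stringsAsFactors = FALSE)
  while (length(vars) > min_vars) {
    # Quality obtained after removing each remaining variable in turn.
    q <- sapply(vars, function(v)
      cluster_quality(x[, setdiff(vars, v), drop = FALSE], k))
    worst <- names(which.max(q))  # least useful: removing it hurts least
    vars <- setdiff(vars, worst)
    path <- rbind(path,
                  data.frame(dropped = worst, n_vars = length(vars),
                             quality = max(q), stringsAsFactors = FALSE))
  }
  path
}

x <- scale(iris[, 1:4])  # standardise so no variable dominates by its units
path <- backward_drop(x, k = 3)
print(path)
```

Each row of path records which variable was dropped at that step and
the quality of the clustering on the variables that remain, so a sharp
fall in quality marks a variable that mattered.  The ratio is only
roughly comparable across subsets of different size, so treat the
ordering as a guide rather than a test.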

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron