[R] Discriminant Function Analysis

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Tue Jul 5 16:29:28 CEST 2005

Dear All

This is more of a statistics question than a question about R itself,
so please forgive me.

I am using lda from the MASS package to perform linear discriminant
function analysis.  I have 14 cases belonging to two groups and have
measured 37 variables on each case.  I want to find the variables that
best discriminate between the two groups, visualise that discrimination,
and create a classification function.  Please note that at this stage
this is a proof-of-concept problem - I realise that I must follow it up
with a much more robust analysis involving cross-validation.

1) First problem: I got this warning message:
> z <- lda(C0GRP_NA ~ ., dpi30)
Warning message: 
variables are collinear in: lda.default(x, grouping, ...) 

I guess this is not a good thing.  However, I *did* get a result, and it
discriminated perfectly between my groups.  Can anyone explain what the
warning means?  Does it invalidate my results?
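
To make this concrete, here is a minimal sketch on simulated data (not my
real dpi30 data; the names X, grp and sim are made up for illustration).
With 14 cases and 37 variables the centred data matrix can have rank at
most 13, so I assume some collinearity is unavoidable at this sample size:

```r
library(MASS)  # provides lda()

set.seed(1)
n <- 14; p <- 37
X <- matrix(rnorm(n * p), nrow = n)          # simulated stand-in for the 37 measurements
grp <- factor(rep(c("A", "B"), each = 7))    # two groups of 7 cases (assumed sizes)

# Centre each column; the centred matrix has rank at most n - 1 = 13 < 37
Xc <- scale(X, center = TRUE, scale = FALSE)
rank_Xc <- qr(Xc)$rank

# lda() still returns a fit, but warns that the variables are collinear;
# here the warning text is captured instead of printed
sim <- data.frame(grp = grp, X)
fit_warning <- tryCatch(
  { lda(grp ~ ., data = sim); NA_character_ },
  warning = function(w) conditionMessage(w)
)
```

If that reasoning holds, perfect separation of 14 cases in 37 dimensions
may not mean much on its own, which is why I ask.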

2) My analysis came up with one discriminant function.  How do I control
how many are produced?  I currently assume this is the only significant
discriminant function found.  Can I insist that it finds more?
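
For what it is worth, my understanding is that LDA can return at most
min(p, g - 1) discriminant functions for p variables and g groups, so two
groups could only ever give one.  A small sketch using the built-in iris
data (a stand-in, not my data) seems to bear this out:

```r
library(MASS)

# Two groups: drop one species so only two factor levels remain
iris2 <- droplevels(subset(iris, Species != "setosa"))
fit2 <- lda(Species ~ ., data = iris2)
n_ld_two <- ncol(fit2$scaling)     # number of discriminant functions returned

# Three groups: the full iris data
fit3 <- lda(Species ~ ., data = iris)
n_ld_three <- ncol(fit3$scaling)
```

Here n_ld_two and n_ld_three hold the number of columns of the scaling
matrix, i.e. the number of discriminant functions in each fit.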

3) More of a request for a tip - when my analysis finds only one
discriminant function, what is a good way to visualise it graphically?
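
Two options I can think of are sketched below, again on a two-group
subset of iris rather than my own data.  The object names are
illustrative only, and I do not know whether either plot is the
recommended one:

```r
library(MASS)

iris2 <- droplevels(subset(iris, Species != "setosa"))
fit <- lda(Species ~ ., data = iris2)

# Scores of each case on the single discriminant function
scores <- predict(fit)$x[, 1]

# Option 1: per-group histograms of the scores (ldahist() is in MASS)
ldahist(scores, g = iris2$Species)

# Option 2: a strip chart, which may suit a small number of cases better
stripchart(scores ~ iris2$Species, method = "jitter",
           xlab = "LD1 score", ylab = "Group")
```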

4) Can I work out from the coefficients which subsets of my variables
are better at discriminating than others?  I guess I could simply
perform a t-test on each variable first to select the best ones...?
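
The t-test idea would look something like the sketch below (iris subset
as a stand-in; vars, pvals and ranked are names I made up).  I am aware
that testing 37 variables on 14 cases and then selecting on the same data
is likely to be optimistic, and that univariate tests ignore how the
variables discriminate jointly:

```r
iris2 <- droplevels(subset(iris, Species != "setosa"))
vars <- setdiff(names(iris2), "Species")
g <- iris2$Species

# Welch two-sample t-test for each variable, one at a time
pvals <- sapply(vars, function(v) {
  x <- split(iris2[[v]], g)          # list with one numeric vector per group
  t.test(x[[1]], x[[2]])$p.value
})

# Variables ordered from smallest to largest p-value
ranked <- sort(pvals)
```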

5) How do I turn my discriminant function into a classification
function?  When I plot the scores for the groups, I can see graphically
that all the values for one group are below 0.1 and all the values for
the other group are above 1, but I do not know how to turn that into a
rule for classifying new cases.
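
My guess is that predict() on the fitted lda object is the intended
route.  A minimal sketch, assuming a train/test split of the iris subset
(the split size and object names are arbitrary choices of mine):

```r
library(MASS)

iris2 <- droplevels(subset(iris, Species != "setosa"))
set.seed(1)
train_idx <- sample(nrow(iris2), 70)           # arbitrary split for illustration
fit <- lda(Species ~ ., data = iris2[train_idx, ])

# predict() returns the assigned class, posterior probabilities and scores
pred <- predict(fit, newdata = iris2[-train_idx, ])
predicted_class <- pred$class

# Cross-tabulate the held-out labels against the predicted ones
confusion <- table(actual = iris2$Species[-train_idx],
                   predicted = predicted_class)
```

If that is right, pred$posterior would give the group membership
probabilities behind each assignment, rather than a hand-picked cut-off
on the scores.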

Many thanks in advance for your help

