[R] Cross-validation for Linear Discriminant Analysis

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 16 06:50:41 CEST 2004


On Wed, 15 Sep 2004, Yu Shao wrote:

> I am new to R and statistics and I have two questions.

Perhaps then you need to start by explaining why you are using LDA.
Please take a good look at the posting guide.

> First I need help to interpret the cross-validation result from the R
> linear discriminant analysis function "lda". 

You mean Professor Ripley's function lda in package MASS, I guess.

> I did the following:
> 
> lda (group ~ Var1 + Var2, CV=T)

R allows you to use meaningful names, so please do so.

> where "CV=T" tells lda to do cross-validation. The output of lda includes
> the posterior probabilities, among other things, but I can't find an error
> term (like delta returned by cv.glm). My question is how to get such an
> error term from the output? Can I just simply calculate the prediction
> accuracy using the posterior probabilities from the cross-validation, and
> use that to measure the quality of the model?

cv.glm as in Dr Canty's package boot?  If you are trying to predict
classifications, LDA is not the right tool, and LOO CV probably is not
either.  There is no unique definition of `error term' (true for cv.glm as
well), and people have written whole books about how to assess
classifiers.  LDA is about `discrimination' not `allocation' in the jargon 
used ca 1960.
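
To illustrate why there is no unique `error term': with CV=TRUE, lda in
package MASS returns a list whose components include `class' (the
leave-one-out predicted class) and `posterior' (the leave-one-out posterior
probabilities). The sketch below is only an illustration, not a
recommendation of either measure. It assumes the built-in iris data and its
variable names, which are not from the original question, and it computes
two different summaries that could each be called an error.

```r
library(MASS)

# Assumption: iris stands in for the poster's data; Species plays the
# role of 'group', and the two sepal measurements the role of Var1, Var2.
fit <- lda(Species ~ Sepal.Length + Sepal.Width, data = iris, CV = TRUE)

# With CV = TRUE the result holds leave-one-out predictions, not a model:
#   fit$class     - predicted class for each observation
#   fit$posterior - matrix of posterior probabilities, one column per class
confusion <- table(observed = iris$Species, predicted = fit$class)
print(confusion)

# One possible summary: the leave-one-out misclassification rate.
loo_error <- mean(fit$class != iris$Species)

# Another: the mean negative log posterior probability of the true class.
# Columns of fit$posterior follow the factor levels of Species.
idx <- cbind(seq_len(nrow(iris)), as.integer(iris$Species))
loo_logloss <- mean(-log(fit$posterior[idx]))

cat("LOO misclassification rate:", loo_error, "\n")
cat("LOO mean negative log posterior:", loo_logloss, "\n")
```

The two numbers answer different questions: the first counts wrong
allocations only, while the second also penalises confident but wrong
posterior probabilities. Which one (if either) is appropriate depends on
what the classifier is for.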

> Another question is more basic: how to determine if an lda model is
> significant? (There is no p-value.) Thanks,

Please do read the references on the ?lda page.  It's not a useful
question, as LDA is about discriminating between populations and makes the
unrealistic assumption of multivariate normality.  (Analogously for linear
regression, there are ways to test if that is (statistically)
`significant', but knowledgeable users almost never do so.)

Perhaps more realistic advice is to suggest you seek some statistical 
consultancy.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



