[R] LDA once again

Edoardo M Airoldi eairoldi at stat.cmu.edu
Sun May 25 07:15:50 CEST 2003


hi there,
 i have one more question about LDA.  just to make sure i understand,
suppose we have two classes, then if i specify a prior=c(.3,.7) in
lda(...) this will affect my between classes covariance matrix as in:

 SB = (.3*m1 - .7*m2) %*% inv(Sigma) %*% t(.3*m1 - .7*m2)

 [is Sigma affected?] and the threshold for deciding which class to assign
my 'test' data to will be log(.3/.7)

 if i specify a prior=c(.2,.8) in predict(...), but not in lda(...), then
SB will not be affected, but the threshold to decide which class to
assign to my 'test' data will be at log(.8/.2)
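
 one way to check the predict(...) part empirically might be the sketch
below. it assumes the MASS package and uses a two-class subset of the
built-in iris data as a stand-in for my own data (both are illustrative
choices, not part of my actual problem). if my reading is right, the
fitted object stays the same and only the posterior log-odds move, by the
change in the log prior odds.

```r
library(MASS)  # provides lda() and its predict() method

# Two-class subset of the built-in iris data (50 observations per class).
d <- droplevels(subset(iris, Species != "setosa"))

# Fit without a prior: lda() then uses the class proportions, here c(.5, .5).
fit <- lda(Species ~ ., data = d)

# Same fitted object, two different priors at prediction time.
p_default <- predict(fit, d)
p_skewed  <- predict(fit, d, prior = c(.2, .8))

# Posterior log-odds of class 1 versus class 2 under each prior.
logodds <- function(p) log(p$posterior[, 1] / p$posterior[, 2])
shift <- logodds(p_skewed) - logodds(p_default)

range(shift)                  # should be (numerically) a single constant
log(.2 / .8) - log(.5 / .5)   # change in the log prior odds, for comparison

# Cross-tabulate the assigned classes to see which observations switch.
table(default = p_default$class, skewed = p_skewed$class)
```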


                        --- --- --- manual --- --- ---
Details:

     The function tries hard to detect if the within-class covariance
     matrix is singular. If any variable has within-group variance less
     than `tol^2' it will stop and report the variable as constant. 
     This could result from poor scaling of the problem, but is more
     likely to result from constant variables.

     Specifying the `prior' will affect the classification unless
     over-ridden in `predict.lda'. Unlike in most statistical packages,
     it will also affect the rotation of the linear discriminants
     within their space, as a weighted between-groups covariance matrix
     is used. Thus the first few linear discriminants emphasize the
     differences between groups with the weights given by the prior,
     which may differ from their prevalence in the dataset.
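
 the paragraph above could be probed with a small experiment like the one
below. it is only a sketch under assumptions: it uses the MASS package, the
three-class iris data (with two classes there is a single discriminant, so
there is no rotation to see), and an arbitrary prior c(.1, .2, .7). W is my
own hand-computed pooled within-class covariance, not something returned by
lda(...).

```r
library(MASS)  # provides lda()

# Three classes give two discriminants, so a rotation can show up.
fit_a <- lda(Species ~ ., data = iris)                        # prior = class proportions
fit_b <- lda(Species ~ ., data = iris, prior = c(.1, .2, .7)) # arbitrary illustrative prior

# Coefficients of the linear discriminants under each prior.
fit_a$scaling
fit_b$scaling

# Both sets of discriminants should span the same 2-D space:
# stacking them side by side should still give rank 2.
qr(cbind(fit_a$scaling, fit_b$scaling))$rank

# Pooled within-class covariance, computed by hand without any prior.
X <- as.matrix(iris[, 1:4])
g <- iris$Species
resid <- X - apply(X, 2, function(col) ave(col, g))  # subtract group means
W <- crossprod(resid) / (nrow(X) - nlevels(g))

# If Sigma is not affected by the prior, W should be the identity
# in discriminant coordinates for both fits.
wa <- t(fit_a$scaling) %*% W %*% fit_a$scaling
wb <- t(fit_b$scaling) %*% W %*% fit_b$scaling
round(wa, 6)
round(wb, 6)
```

 if both products come out as the identity while the two scaling matrices
differ, that would suggest the prior only rotates the discriminants and
leaves the within-class covariance alone.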



