[R] splAw: RE: discriminant analysis lda under MASS

David L Carlson dcarlson at tamu.edu
Thu Mar 3 19:06:44 CET 2016

I think the answer is in Venables and Ripley, Modern Applied Statistics with S. 4th Edition, 2002, page 332:

"Fisher (1936) introduced a linear discriminant analysis seeking a linear combination
xa of the variables that has a maximal ratio of the separation of the class
means to the within-class variance, that is, maximizing the ratio a^T B a / a^T W a.
To compute this, choose a sphering (see page 305) xS of the variables so that
they have the identity as their within-group correlation matrix. On the rescaled
variables the problem is to maximize a^T B a subject to ||a|| = 1, and as we saw
for PCA, this is solved by taking a to be the eigenvector of B corresponding to
the largest eigenvalue."

Here B is the between-classes covariance matrix and W is the within-classes covariance matrix.
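
As a rough sketch of what the quoted passage implies for the two-group case: the discriminant direction is proportional to W^-1 (m1 - m2), and rescaling it so that a^T W a = 1 should reproduce the coefficients that lda() prints, up to sign. This assumes that lda() normalizes its `scaling` so that the within-groups covariance matrix is spherical, as the `scaling` entry in ?lda describes. The data are the 20 rows from the original question further down the thread; the names `dat`, `W`, `a_raw` and `a_unit` are made up for this illustration.

```r
library(MASS)

# Data as posted in the original question (decimal commas written as points)
dat <- data.frame(
  Gruppe    = rep(c(1, 2), times = c(14, 6)),
  Einwohner = c(1642, 2418, 1417, 2761, 3991, 2500, 6261, 3260, 2516, 4451,
                3504, 5431, 3523, 5471, 7172, 9419, 8780, 5070, 5780, 8630),
  Kosten    = c(478.2, 247.3, 223.6, 505.6, 399.3, 276, 542.5, 308.9, 453.6,
                430.2, 413.8, 379.7, 400.5, 404.1, 499.4, 674.9, 468.6, 601.5,
                578.8, 641.5)
)

X  <- as.matrix(dat[, c("Einwohner", "Kosten")])
g  <- dat$Gruppe
m1 <- colMeans(X[g == 1, ])
m2 <- colMeans(X[g == 2, ])

# Pooled within-group covariance matrix W (divisor: n minus number of groups)
W <- (cov(X[g == 1, ]) * (sum(g == 1) - 1) +
      cov(X[g == 2, ]) * (sum(g == 2) - 1)) / (nrow(X) - 2)

# Unnormalized two-group direction W^-1 (m1 - m2)
a_raw <- solve(W, m1 - m2)

# Rescale so that a' W a = 1, i.e. unit within-group variance along a
a_unit <- a_raw / sqrt(drop(t(a_raw) %*% W %*% a_raw))

fit <- lda(Gruppe ~ Einwohner + Kosten, data = dat)

# Compare up to sign; the sign of a discriminant axis is arbitrary
cbind(lda = fit$scaling[, 1], by_hand = a_unit)
```

If the assumption holds, the two columns agree in absolute value, so the lda() coefficients and a hand-computed W^-1 (m1 - m2) differ only by a constant factor.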

David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: Jens Koch [mailto:Jens.Koch at gmx.li] 
Sent: Thursday, March 3, 2016 8:40 AM
To: David L Carlson
Cc: r-help at r-project.org
Subject: splAw: RE: [R] discriminant analysis lda under MASS

Thank you for your answer. Please let me provide additional information:
You have a pooled variance-covariance matrix S (2x2). The matrix needs to be inverted: S^-1.

The coefficients are calculated as S^-1 (average(x1) - average(x2)), where average(x1) is the vector of means (2x1) in group 1 and average(x2) is the vector of means (2x1) in group 2.

S is as follows: [1,1]: 2267168.12; [1,2] = [2,1]: 49088.07; [2,2]: 8381.77
average(x1) is as follows: [1,1]: 3510.4; [2,1]: 390.2
average(x2) is as follows: [1,1]: 7975.2; [2,1]: 577.5

So with the formula above in mind, the calculation by hand is very easy.
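
A minimal sketch of this hand calculation in R, using only the numbers quoted in this message. The names `S`, `x1_bar`, `x2_bar` and `coef_textbook` are chosen for illustration. The group 2 mean for Einwohner is entered as quoted (7975.2), although the lda() output in the original question reports 7475.167 for the posted data, so the two calculations may not rest on identical numbers.

```r
# Pooled covariance matrix S and group mean vectors as quoted in the message
S <- matrix(c(2267168.12, 49088.07,
                49088.07,  8381.77), nrow = 2, byrow = TRUE)
x1_bar <- c(3510.4, 390.2)
x2_bar <- c(7975.2, 577.5)

# S^-1 (x1_bar - x2_bar); solve(S, b) avoids forming the inverse explicitly
coef_textbook <- solve(S, x1_bar - x2_bar)
round(coef_textbook, 5)
```

With these inputs the result comes out at roughly -0.0017 and -0.0124, in line with the textbook coefficients mentioned in the original question.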

However, my problem is that I cannot find any way to get this result using R.

Or, put another way: what does R estimate, and is there any way to link the two results?



Sent: Thursday, 3 March 2016 at 15:08
From: "David L Carlson" <dcarlson at tamu.edu>
To: "Jens Koch" <Jens.Koch at gmx.li>, "r-help at r-project.org" <r-help at r-project.org>
Subject: RE: [R] discriminant analysis lda under MASS
If the textbook provides the equations, you can work through them directly. But without knowing more, it is hard to say. You could also contact the author of the textbook.

David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jens Koch
Sent: Wednesday, March 2, 2016 9:19 AM
To: r-help at r-project.org
Subject: [R] discriminant analysis lda under MASS

Hello all,

I'd like to run a simple discriminant analysis to jump into the topic with the following dataset provided by a textbook:

Gruppe Einwohner Kosten
1 1642 478,2
1 2418 247,3
1 1417 223,6
1 2761 505,6
1 3991 399,3
1 2500 276
1 6261 542,5
1 3260 308,9
1 2516 453,6
1 4451 430,2
1 3504 413,8
1 5431 379,7
1 3523 400,5
1 5471 404,1
2 7172 499,4
2 9419 674,9
2 8780 468,6
2 5070 601,5
2 5780 578,8
2 8630 641,5

According to the textbook, the coefficients should be -0.00170 and -0.01237.

If I put the data into the lda function under MASS, my result is:

lda(Gruppe ~ Einwohner + Kosten, data = data)

Prior probabilities of groups:
  1   2
0.7 0.3

Group means:
  Einwohner   Kosten
1  3510.429 390.2357
2  7475.167 577.4500

Coefficients of linear discriminants:
                   LD1
Einwohner 0.0004751092
Kosten    0.0050994964
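
For anyone wanting to reproduce this output, here is one possible way to read the table exactly as posted. It is a sketch that assumes the values are whitespace-separated and that Kosten uses a decimal comma, which read.table() can handle through its `dec` argument. The name `txt` is made up for the example; `data` matches the call shown above.

```r
library(MASS)

# The table as posted: whitespace-separated, decimal comma in Kosten
txt <- "Gruppe Einwohner Kosten
1 1642 478,2
1 2418 247,3
1 1417 223,6
1 2761 505,6
1 3991 399,3
1 2500 276
1 6261 542,5
1 3260 308,9
1 2516 453,6
1 4451 430,2
1 3504 413,8
1 5431 379,7
1 3523 400,5
1 5471 404,1
2 7172 499,4
2 9419 674,9
2 8780 468,6
2 5070 601,5
2 5780 578,8
2 8630 641,5"

# dec = "," makes Kosten numeric instead of character
data <- read.table(text = txt, header = TRUE, dec = ",")
str(data)

fit <- lda(Gruppe ~ Einwohner + Kosten, data = data)
fit$prior    # prior probabilities, taken from the group proportions by default
fit$means    # group means
fit$scaling  # coefficients of linear discriminants (sign may be flipped)
```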

I also tried to solve it with another software package, but that did not give the result I expected either. I now know that R standardizes the coefficients and that, in the end, the discriminating power is the same.

But: How can I get (calculate) the results printed in the textbook with R?

Thanks in advance,

