[R] Scaling Matrix in qda() function in MASS package

William Dunlap wdunlap at tibco.com
Thu Aug 24 23:01:37 CEST 2017


If you multiply the data for a given group by that group's scaling matrix,
the sample covariance matrix of the transformed data is the identity.  E.g.,

> z <- qda(iris[-5], grouping=iris$Species)
> zapsmall(var( as.matrix(subset(iris, Species=="virginica", 1:4)) %*%
z$scaling[,,"virginica"] ))
  1 2 3 4
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
> zapsmall(var( as.matrix(subset(iris, Species=="versicolor", 1:4)) %*%
z$scaling[,,"versicolor"] ))
  1 2 3 4
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
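
As for where such a matrix could come from: the printed scaling matrices are
upper triangular, and an upper-triangular matrix W with t(W) %*% S %*% W = I
is the inverse of the Cholesky factor of S, up to the sign of each column.
The sketch below rebuilds it that way. It is a reconstruction under that
assumption, not necessarily the computation qda() performs internally, and
the names x, S, R and W are mine. It also assumes the default
method="moment" (divisor n-1); with method="mle" the divisor would be n.

```r
library(MASS)

z <- qda(iris[-5], grouping = iris$Species)

# Data for one group and its sample covariance matrix (divisor n - 1).
x <- as.matrix(subset(iris, Species == "virginica", 1:4))
S <- var(x)

# chol() returns an upper-triangular R with t(R) %*% R equal to S,
# so W = R^{-1} is upper triangular and t(W) %*% S %*% W is the identity.
R <- chol(S)
W <- backsolve(R, diag(ncol(x)))

# Whitening check: the covariance of the transformed data.
zapsmall(var(x %*% W))

# Compare with the matrix qda() returned, ignoring column signs
# (assumption: the two can differ only by the sign of whole columns).
all.equal(abs(W), abs(z$scaling[, , "virginica"]), check.attributes = FALSE)
```

The comparison uses abs() because flipping the sign of a column of W leaves
t(W) %*% S %*% W unchanged, so the signs are not determined by the whitening
property alone.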

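To check the above against the actual source, the unexported method that is
discussed further down the thread can be fetched in more than one way. A
small sketch, assuming only base R's utils functions and the MASS package;
the variable name f is mine:

```r
library(MASS)

# qda() itself is an exported generic; the work is done in its methods,
# which are registered with the namespace but not exported.
f <- getS3method("qda", "default")

# This is the same function object that MASS:::qda.default refers to.
identical(f, MASS:::qda.default)

# Show only the source lines that mention the scaling array.
grep("scaling", deparse(f), value = TRUE)
```

Printing f in full gives the whole method; the grep() call is just a quick
way to find the lines that fill in the scaling array.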

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Aug 24, 2017 at 12:54 PM, Ranjan Maitra <maitra at email.com> wrote:

> I guess the question that is being asked here is what is the scaling
> matrix that is being returned in the qda object. The help file on qda()
> says:
> ...
> scaling: for each group ‘i’, ‘scaling[,,i]’ is an array which transforms
> observations so that within-groups covariance matrix is spherical.
> ...
>
> This is a bit ambiguous. I tried a few cases (spectral, QR decomposition,
> especially given that it is an upper triangular matrix) but was unable to
> match the result.
>
> Unless someone knows, there is no recourse but to muck through source code.
>
> Btw, I think the following will give the necessary source code:
>
> MASS:::qda.default
>
>
> Hope this helps!
>
> Best wishes,
> Ranjan
>
>
> On Wed, 23 Aug 2017 15:58:30 -0700 Bert Gunter <bgunter.4567 at gmail.com>
> wrote:
>
> > You need to learn how to access code for nonexported methods.  See ? "::"
> >
> > > methods(qda)
> > [1] qda.data.frame* qda.default*    qda.formula*    qda.matrix*
> > see '?methods' for accessing help and source code
> >
> > Shows you that the methods are not exported from the namespace. Hence
> > you need to use the triple colon operator to see their code, e.g. for
> > the default method:
> >
> > > MASS:::qda.default
> >
> > Once you have the code, I presume this will answer your question.
> >
> > Cheers,
> > Bert
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Wed, Aug 23, 2017 at 2:44 PM, Souradeep Chattapadhyay
> > <soura at iastate.edu> wrote:
> > > Hello,
> > >            I am Souradeep Chattopadhyay and I am a graduate student at Iowa
> > > State University Department of Statistics.
> > >
> > > Can anyone please explain the mathematical formulation behind the scaling
> > > matrix returned by the qda function in the MASS package? I want to understand
> > > how this scaling matrix is derived from the inputs given to the qda
> > > function.
> > >
> > > Example Code
> > >
> > > The following example is using the banknote data in the MCLUST package.
> > >
> > > *Code*
> > >
> > > require(MASS)
> > > require(mclust)
> > > data(banknote)
> > > quad<-qda(banknote[,-1], grouping=banknote$Status, method="mle")
> > > quad$scaling
> > >
> > >
> > > Scaling matrix returned by qda for this data is
> > >
> > > , , counterfeit
> > >
> > >                 1         2           3            4          5          6
> > > Length   2.853988  1.069414 -0.05279774  0.750531723 -0.2053821  0.6986088
> > > Left     0.000000 -4.208108 -3.04707132 -0.026804815 -0.8644062 -1.1088947
> > > Right    0.000000  0.000000  4.27383763  0.003205759  0.3313675  1.3865888
> > > Bottom   0.000000  0.000000  0.00000000  0.917596063 -0.8707772  0.7274894
> > > Top      0.000000  0.000000  0.00000000  0.000000000 -2.2041415  0.6956074
> > > Diagonal 0.000000  0.000000  0.00000000  0.000000000  0.0000000 -2.1879157
> > >
> > > , , genuine
> > >
> > >                 1         2          3          4          5          6
> > > Length   2.592911 -1.169164  0.6105339 -0.3614352 -0.2520496 -0.5281743
> > > Left     0.000000  3.027882  2.2392994 -0.2842368 -1.2092325  0.6927868
> > > Right    0.000000  0.000000 -3.8684746 -0.3972362 -0.4177546 -0.1062555
> > > Bottom   0.000000  0.000000  0.0000000  1.6376150  1.7274240  0.3969998
> > > Top      0.000000  0.000000  0.0000000  0.0000000  2.3022115  0.6318543
> > > Diagonal 0.000000  0.000000  0.0000000  0.0000000  0.0000000  2.4516680
> > >
> > >
> > >
> > > Thanks and Regards
> > >
> > > Souradeep
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
> --
> Important Notice: This mailbox is ignored: e-mails are set to be deleted
> on receipt. Please respond to the mailing list if appropriate. For those
> needing to send personal or professional e-mail, please use appropriate
> addresses.
>
>
