[R] Calculating large determinants

Wed Dec 5 10:42:05 CET 2007

>>>>> "MJ" == maj  <maj at stats.waikato.ac.nz>
>>>>>     on Wed, 5 Dec 2007 14:18:23 +1300 (NZDT) writes:

    MJ> I apologise for not including a reproducible example
    MJ> with this query but I hope that I can make things clear
    MJ> without one.

    MJ> I am fitting some finite mixture models to data. Each
    MJ> mixture component has p parameters (p=29 in my
    MJ> application) and there are q components to the
    MJ> mixture. The number of data points is n ~ 1500.

    MJ> I need to select a good q and I have been considering model selection
    MJ> methods suggested in Chapter 6 of
    MJ> @BOOK{mp01,
    MJ> author    = {McLachlan, G. J. and Peel, D.},
    MJ> title     = {Finite Mixture Models},
    MJ> publisher = {Wiley},
    MJ> address   = {New York},
    MJ> year      = {2001}
    MJ> }

    MJ> One of these methods involves an "empirical information
    MJ> matrix" which is the matrix of products of parameter
    MJ> scores at the observation level evaluated at the MLE and
    MJ> summed over all observations. For example a six-component
    MJ> mixture will have 6 - 1 + 29*6 = 179 parameters. So for
    MJ> observation i I form the 179 by 179 matrix of products of
    MJ> scores and sum these up over all 1500-odd observations.

    MJ> Actually it is the log of the determinant of the resultant matrix that I
    MJ> really need rather than the matrix itself. I am seeking advice on what may
    MJ> be the best way to evaluate this log(det()). I have been encountering
    MJ> problems using
    MJ> determinant(SS,logarithm=TRUE)

    MJ> and   eigen(SS,only.values = TRUE)$values

    MJ> shows some negative eigenvalues.

which is a problem?
In that case I guess your problem is that you want to estimate a
positive definite matrix S but your estimate S^ is not quite
positive definite.

Function posdefify() in CRAN package "sfsmisc" provides an old
cheap solution to this problem,
where  nearPD() in package 'Matrix' (based on a donation from
Jens Oehlschlaegel) provides a more sophisticated algorithm for
this problem.

If you really only need the eigenvalues of the "corrected"
matrix, you might want to abbreviate the nearPD() function by
just returning the final 'd' vector of eigenvalues.

    MJ> Advice will be gratefully received!

I'll be glad to hear if and how you'd use one of these two functions.

Martin Maechler, ETH Zurich

    MJ>    Murray Jorgensen