PCA with n << p (was [R] R-1.6.0 crashing on RedHat6.3)

ripley@stats.ox.ac.uk ripley at stats.ox.ac.uk
Tue Oct 29 05:51:05 CET 2002


princomp is the wrong tool here: prcomp is better (and a version using
La.svd would be better still).

What do you want to do with a PCA of such a matrix?  We can almost
certainly give you a better way using La.svd directly.
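
As a rough sketch of what I mean (not a finished function), the PCA can
be read off the SVD of the column-centred data matrix, so the p x p
covariance matrix is never formed.  The dimensions follow the toy example
quoted below; the seed and the names Dc, s, sdev, loadings and scores are
my own choices for illustration, and the n - 1 divisor follows prcomp
rather than princomp.

```r
# Sketch: PCA of an n x p matrix with n << p via the SVD of the centred
# data, without ever building the p x p covariance matrix.
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 144; p <- 2500               # same shape as the toy example in the thread
D <- matrix(rnorm(n * p), nrow = n, ncol = p)

Dc <- scale(D, center = TRUE, scale = FALSE)  # centre each column
s  <- La.svd(Dc, nu = n, nv = n)              # Dc = U diag(d) Vt, n components

sdev     <- s$d / sqrt(n - 1)     # component standard deviations
loadings <- t(s$vt)               # p x n matrix of loadings
scores   <- s$u %*% diag(s$d)     # n x n matrix of scores (= Dc %*% loadings)
```

After centring, the data have rank at most n - 1, so the last singular
value is zero up to rounding error.  The largest objects held are p x n
rather than p x p.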

BDR

On 28 Oct 2002, Peter Dalgaard BSA wrote:

> Douglas Grove <dgrove at fhcrc.org> writes:
>
> > > Are you sure that it is 6.3?? To my knowledge, there is nothing
> > > between 6.2 and 7.0. What's in /etc/redhat-release ?
> >
> > Sorry, I was told it was running 6.3. I just checked and
> > it's running 6.2.
>
> OK, so the Fortran problems are there, but the usual symptom of that
> is crash-on-load.
>
>
> > > The Fortran in RH6.x was rather badly broken for some packages, but
> > > one would expect that you had run into that before. 1.6.0 has a memory
> > > leak but it generally affects repeated applications of model fits,
> > > rather than big matrices.
> > >
> > > Do you really mean 144x5300 ? (more columns than rows) That's big: The
> > > covariance matrix at 5300x5300 will take more than 200 MB (OK, it
> > > might only be storing upper or lower triangle.) I tried a matrix like
> > > that on a 1.6.1beta system with about 0.75 GB and got an out of memory
> > > error. A 144x2500 problem is currently running in
> > >
> > >   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
> > > 16111 pd        17   0  207M 163M 18536 R    99.8 65.8   1:58 R.bin
> > >
> > > and seems to be staying there....
> >
> > Yep, it's 144x5300.  The machine has 2GB of RAM, and this uses about 1.5GB.
>
> Hmm. My half-size toy version conked out with
>
> > D <- matrix(rnorm(2500*144),ncol=2500)
> > library(mva)
> > pc.norm <- princomp(D,scores=FALSE)
> Error in princomp.default(D, scores = FALSE) :
>         covariance matrix is not non-negative definite
>
> which is a bit odd, but at least it didn't run out of memory. However,
> the tolerances seem to require some tweaking!
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
