PCA with n << p (was [R] R-1.6.0 crashing on RedHat6.3)
ripley at stats.ox.ac.uk
Tue Oct 29 05:51:05 CET 2002
princomp is the wrong tool here: prcomp is better (and a version using
La.svd would be better still).
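
A rough illustration of the point above, not part of the original post. princomp works from the p x p covariance matrix, whereas prcomp in current versions of R works from an SVD of the centred n x p data, so it never forms that matrix. The small 20 x 100 test matrix and all variable names are assumptions made for the sketch.

```r
# Storage for a dense p x p covariance matrix of doubles (8 bytes each),
# using the dimensions quoted in this thread.
n <- 144
p <- 5300
cov_bytes  <- p^2 * 8        # dense covariance matrix
data_bytes <- n * p * 8      # the data matrix itself
cov_mib  <- cov_bytes / 2^20
data_mib <- data_bytes / 2^20

# Small-scale check (hypothetical sizes): prcomp copes with more columns
# than rows and returns at most min(n, p) components.
set.seed(1)
X  <- matrix(rnorm(20 * 100), nrow = 20)
pc <- prcomp(X)
n_comp <- length(pc$sdev)
```

After centring, the data have rank at most n - 1, so the last component returned has a standard deviation of essentially zero.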
What do you want to do with a PCA of such a matrix? We can almost
certainly give you a better way using La.svd directly.
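
As a minimal sketch of what "using La.svd directly" could look like (an assumption about the approach, not code from the original post): centre the columns, take a thin SVD, and read the component standard deviations, scores and loadings off the factors. The matrix sizes and variable names are hypothetical.

```r
set.seed(42)
n <- 10
p <- 200
X <- matrix(rnorm(n * p), nrow = n)

# Centre each column; PCA is defined on the centred data.
Xc <- sweep(X, 2, colMeans(X))

# Thin SVD: Xc = u %*% diag(d) %*% vt, with u n x n and vt n x p.
s <- La.svd(Xc, nu = n, nv = n)

sdev     <- s$d / sqrt(n - 1)    # component standard deviations
scores   <- s$u %*% diag(s$d)    # n x n matrix of principal component scores
loadings <- t(s$vt)              # p x n matrix, one column per component

# Cross-check against prcomp (signs of components are arbitrary).
pc <- prcomp(X)
sdev_ok   <- isTRUE(all.equal(sdev, pc$sdev))
scores_ok <- isTRUE(all.equal(abs(scores[, 1:(n - 1)]),
                              abs(unname(pc$x[, 1:(n - 1)]))))
```

Only matrices with at most n columns or n rows beyond the data itself are created, so memory use stays proportional to n * p rather than p^2.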
On 28 Oct 2002, Peter Dalgaard BSA wrote:
> Douglas Grove <dgrove at fhcrc.org> writes:
> > > Are you sure that it is 6.3?? To my knowledge, there is nothing
> > > between 6.2 and 7.0. What's in /etc/redhat-release ?
> > Sorry, I was told it was running 6.3. I just checked and
> > it's running 6.2.
> OK, so the Fortran problems are there, but the usual symptom of that
> is crash-on-load.
> > > The Fortran in RH6.x was rather badly broken for some packages, but
> > > one would expect that you had run into that before. 1.6.0 has a memory
> > > leak but it generally affects repeated applications of model fits,
> > > rather than big matrices.
> > >
> > > Do you really mean 144x5300 ? (more columns than rows) That's big: The
> > > covariance matrix at 5300x5300 will take more than 200 MB (OK, it
> > > might only be storing upper or lower triangle.) I tried a matrix like
> > > that on a 1.6.1beta system with about 0.75 GB and got an out of memory
> > > error. A 144x2500 problem is currently running in
> > >
> > >   PID USER PRI NI SIZE  RSS SHARE STAT %CPU %MEM TIME COMMAND
> > > 16111 pd    17  0 207M 163M 18536 R    99.8 65.8 1:58 R.bin
> > >
> > > and seems to be staying there....
> > Yep, it's 144x5300. The machine has 2GB of RAM, and this uses about 1.5GB.
> Hmm. My half-size toy version conked out with
> > D <- matrix(rnorm(2500*144),ncol=2500)
> > library(mva)
> > pc.norm <- princomp(D,scores=FALSE)
> Error in princomp.default(D, scores = FALSE) :
>         covariance matrix is not non-negative definite
> which is a bit odd, but at least it didn't run out of memory. However,
> the tolerances seem to require some tweaking!
> O__ ---- Peter Dalgaard Blegdamsvej 3
> c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595