[R] How to plot PCA output?

Bryan Hanson hanson at depauw.edu
Mon May 7 16:33:33 CEST 2012


I don't know the answer, but Jessica gave some insight.

I avoid the biplot at all costs because, IMHO, it violates one of the tenets of good graphic design: it puts two entirely different scales on the same axes, which is maximally confusing to the end user. So I never use it.
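
If you want to check the scaling for yourself, here is a minimal sketch. It rests on an assumption: my reading of the biplot.prcomp source in the stats package is that, with the default scale = 1, each column of scores is divided by sdev * sqrt(n) before plotting. Please verify that against your own R version (stats:::biplot.prcomp prints the source). The matrix m is simulated stand-in data, not your expression data.

```r
# Simulated stand-in for the expression matrix: 36 samples x 200 variables
set.seed(1)
m <- matrix(rnorm(36 * 200), nrow = 36)
pca <- prcomp(m)

n <- nrow(pca$x)

# Assumed biplot() default (scale = 1): divide each score column
# by its standard deviation times sqrt(n)
lam <- pca$sdev[1:2] * sqrt(n)
scaled <- sweep(pca$x[, 1:2], 2, lam, "/")

range(pca$x[, 1])   # raw scores, as drawn by plot(answer$x[,1], ...)
range(scaled[, 1])  # rescaled scores, as assumed for biplot()
```

If that assumption holds, every rescaled column has a sum of squares of (n - 1)/n, so all values fall between -1 and 1, which fits the -0.4 to 0.4 range you saw. Each axis is only divided by a constant, so the arrangement of the samples is the same apart from the stretch along each axis.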

If it is gene expression data, have you looked in Bioconductor for something that will help you?  Maybe runPCA in package EMA?
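
As a biplot-free alternative, here is a rough sketch that draws the scores and the loadings as two separate plots, each with its own single scale. The data are simulated, and the names expr, S1..S36 and gene1..gene500 are made up for illustration; the choice of showing the 20 largest loadings is arbitrary.

```r
# Simulated stand-in data: 36 samples (rows) x 500 genes (columns)
set.seed(42)
expr <- matrix(rnorm(36 * 500), nrow = 36,
               dimnames = list(paste0("S", 1:36), paste0("gene", 1:500)))
pca <- prcomp(expr)

# Percentage of total variance carried by each PC
pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Score plot, PC1 vs PC2, with sample names in place of points
plot(pca$x[, 1], pca$x[, 2], type = "n",
     xlab = sprintf("PC1 (%.1f%%)", pct[1]),
     ylab = sprintf("PC2 (%.1f%%)", pct[2]))
text(pca$x[, 1], pca$x[, 2], labels = rownames(expr), cex = 0.7)

# Loadings: with thousands of variables, show only the 20 with the
# largest absolute loading on PC1
top <- order(abs(pca$rotation[, 1]), decreasing = TRUE)[1:20]
barplot(pca$rotation[top, 1], names.arg = rownames(pca$rotation)[top],
        las = 2, cex.names = 0.6, ylab = "PC1 loading")
```

With real data you would replace expr with your own matrix, keeping samples in rows so that pca$x has one row per sample.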

Bryan

On May 7, 2012, at 9:57 AM, Christian Cole wrote:

> Hi Bryan,
> 
> 
> Many thanks for the replies.
> 
> The data is gene expression data for 36 samples over 11k genes.
> 
> I see that I can plot PC1 vs PC2 by using $x, but compared to biplot()
> the range of values is different. For example, if I use plot() the PC1
> scale ranges from -150 to 150, whereas in biplot() it ranges from -0.4
> to 0.4. Do you know what scaling biplot() uses? Does it even matter?
> Cheers,
> 
> Chris
> 
> 
> On 07/05/2012 14:36, "Bryan Hanson" <hanson at depauw.edu> wrote:
> 
>> Christian, is that 36 samples x 11K variables?  Sounds like it.  Is this
>> spectroscopic data?
>> 
>> In any case, the scores are in the list element $x as follows:
>> 
>> answer <- prcomp(your_matrix)  # your_matrix: samples in rows, variables in columns
>> 
>> answer$x contains the scores, so if you want to plot the 1st 2 pcs, you
>> could do
>> 
>> plot(answer$x[,1], answer$x[,2])
>> 
>> This works because the columns of answer$x contain the scores of the PCs in order.
>> 
>> [I see Jessica just answered...]
>> 
>> If you want the loading plot, it's going to be interesting with all those
>> variables, but this will do it:
>> 
>> # loadings of the 1st PC
>> plot(1:11000, answer$rotation[,1], type = "l")
>> 
>> Depending upon what kind of data this is, the 1:11000 could be replaced
>> by something more sensible.  If it is spectroscopic data, then replace it
>> with your frequency values.
>> 
>> By the way, plot(answer) will give you the scree plot, which helps you
>> decide how many PCs are worth keeping.
>> 
>> Good luck. Bryan
>> 
>> ***********
>> Bryan Hanson
>> Professor of Chemistry & Biochemistry
>> DePauw University
>> 
>> On May 7, 2012, at 6:22 AM, Christian Cole wrote:
>> 
>>> I have a decent-sized matrix (36 x 11,000) that I have performed a PCA
>>> on with prcomp(), but due to the large number of variables I can't plot
>>> the result with biplot(). How else can I plot the PCA output?
>>> 
>>> I tried posting this before but got no responses, so I'm trying again.
>>> Surely this is a common problem, but I can't find a solution with
>>> Google.
>>> 
>>> 
>>> The University of Dundee is a registered Scottish Charity, No: SC015096
>>> 
>> 
>> 
> 
> 
> 


