[R] How to plot PCA output?

Christian Cole C.Cole at dundee.ac.uk
Mon May 7 16:41:54 CEST 2012


Hi Bryan,


On 07/05/2012 15:33, "Bryan Hanson" <hanson at depauw.edu> wrote:

>I don't know the answer, Jessica gave some insight.
>
>I avoid the biplot at all costs, because IMHO it violates one of the
>tenets of good graphic design:  it has two entirely different scales on
>its axes.  These are maximally confusing to the end-user.  So I never use it.

I couldn't agree more :)


>If it is gene expression data, have you looked in Bioconductor for
>something that will help you?  Maybe runPCA in package EMA?

I may do, but you've answered my question and I've got a PCA plot that works.
Many thanks,

Chris


>Bryan
>
>On May 7, 2012, at 9:57 AM, Christian Cole wrote:
>
>> Hi Bryan,
>>
>>
>> Many thanks for the replies.
>>
>> The data is gene expression data for 36 samples over 11k genes.
>>
>> I see that I can plot PC1 vs PC2 by using $x, but compared to biplot() I
>> can see that the ranges of values are different. For example, if I use
>> plot() the PC1 scale ranges from -150 to 150, whereas in biplot() it
>> ranges from -0.4 to 0.4. Do you know what scaling biplot() uses? Does it
>> even matter?
>> Cheers,
>>
>> Chris
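
On the scaling question: as far as the source of stats:::biplot.prcomp
shows, with the default scale = 1 each score column is divided by
sdev * sqrt(n), where n is the number of samples, and the loadings are
multiplied by the same factor. That would explain -150..150 shrinking to
roughly -0.4..0.4. A minimal sketch of that rescaling, assuming a simulated
matrix (expr and its dimensions are made up here, not the real data):

```r
# Simulated stand-in for the real data: 36 samples (rows) x 200 variables.
set.seed(1)
expr <- matrix(rnorm(36 * 200), nrow = 36)
pca <- prcomp(expr)

n   <- nrow(pca$x)
lam <- pca$sdev[1:2] * sqrt(n)   # per-PC factor applied when scale = 1

# Scores as biplot() draws them: each PC column divided by its own factor.
biplot_scores <- sweep(pca$x[, 1:2], 2, lam, "/")

range(pca$x[, 1])           # the range plot() shows
range(biplot_scores[, 1])   # the much smaller range biplot() shows
```

Since each axis is only divided by a constant, the arrangement of the
samples along each PC is unchanged; only the axis units differ.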
>>
>>
>> On 07/05/2012 14:36, "Bryan Hanson" <hanson at depauw.edu> wrote:
>>
>>> Christian, is that 36 samples x 11K variables?  Sounds like it.  Is this
>>> spectroscopic data?
>>>
>>> In any case, the scores are in the list element $x as follows:
>>>
>>> answer <- prcomp(your_matrix)  # samples in rows, variables in columns
>>>
>>> answer$x contains the scores, so if you want to plot the first two PCs,
>>> you could do
>>>
>>> plot(answer$x[,1], answer$x[,2])
>>>
>>> because the columns of answer$x contain the scores of the PCs in order.
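
A slightly fuller version of that scores plot might look like the sketch
below. It runs on a simulated matrix, so expr, the sample names S1..S36 and
the 200 columns are placeholders rather than the real expression data; the
percentages in the axis labels come from the sdev element of the prcomp
result.

```r
# Simulated stand-in for the real data: 36 samples (rows) x 200 genes (columns).
set.seed(1)
expr <- matrix(rnorm(36 * 200), nrow = 36,
               dimnames = list(paste0("S", 1:36), NULL))

pca    <- prcomp(expr)   # centres the columns by default
scores <- pca$x          # one row per sample, one column per PC

# Percentage of the total variance carried by each PC, for the axis labels.
pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)

plot(scores[, 1], scores[, 2],
     xlab = sprintf("PC1 (%.1f%%)", pct[1]),
     ylab = sprintf("PC2 (%.1f%%)", pct[2]),
     main = "PCA scores")
# Label each point with its sample name.
text(scores[, 1], scores[, 2], labels = rownames(scores), pos = 3, cex = 0.7)
```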
>>>
>>> [I see Jessica just answered...]
>>>
>>> If you want the loading plot, it's going to be interesting with all those
>>> variables, but this will do it:
>>>
>>> plot(1:11000, answer$rotation[,1], type = "l") # loadings of the 1st PC
>>>
>>> Depending upon what kind of data this is, the 1:11000 could be replaced
>>> by something more sensible.  If it is spectroscopic data, then replace it
>>> with your frequency values.
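
For data with no natural ordering of the variables, one possible reading of
"something more sensible" is to take the index from the object itself and
then list the variables with the largest absolute loadings. This is only a
sketch on simulated data: expr and the gene1..gene200 names are hypothetical.

```r
# Simulated stand-in: 36 samples x 200 variables with made-up gene names.
set.seed(1)
expr <- matrix(rnorm(36 * 200), nrow = 36)
colnames(expr) <- paste0("gene", seq_len(ncol(expr)))

pca   <- prcomp(expr)
load1 <- pca$rotation[, 1]   # loadings of the 1st PC, one value per variable

# Index taken from the data instead of a hard-coded 1:11000.
plot(seq_along(load1), load1, type = "l",
     xlab = "Variable index", ylab = "PC1 loading")

# The ten variables contributing most to PC1, by absolute loading.
top <- head(sort(abs(load1), decreasing = TRUE), 10)
print(names(top))
```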
>>>
>>> By the way, plot(answer) will give you the scree plot to help decide how
>>> many PCs are worth keeping.
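
The scree plot can be backed up with the actual proportions of variance,
which are easy to compute from sdev. A small sketch, again on a simulated
matrix (expr is a placeholder, and showing 20 PCs is an arbitrary choice):

```r
# Simulated stand-in for the real data: 36 samples x 200 variables.
set.seed(1)
expr <- matrix(rnorm(36 * 200), nrow = 36)
pca <- prcomp(expr)

plot(pca)                                   # scree plot as bars
screeplot(pca, npcs = 20, type = "lines")   # line version, first 20 PCs

# Proportion of variance per PC and the running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_explained <- cumsum(var_explained)
round(head(cum_explained, 5), 3)
```

summary(pca) reports the same proportions in tabular form.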
>>>
>>> Good luck. Bryan
>>>
>>> ***********
>>> Bryan Hanson
>>> Professor of Chemistry & Biochemistry
>>> DePauw University
>>>
>>> On May 7, 2012, at 6:22 AM, Christian Cole wrote:
>>>
>>>> I have a decent sized matrix (36 x 11,000) that I have performed a PCA on
>>>> with prcomp(), but due to the large number of variables I can't plot the
>>>> result with biplot(). How else can I plot the PCA output?
>>>>
>>>> I tried posting this before, but got no responses so I'm trying again.
>>>> Surely this is a common problem, but I can't find a solution with
>>>> Google.
>>>>
>>>>
>>>> The University of Dundee is a registered Scottish Charity, No: SC015096
>>>>

