[BioC] plotting a CA

aedin culhane aedin at jimmy.harvard.edu
Fri Mar 9 18:49:51 CET 2012


Hi Tim, Aoife and Susan

Sorry Tim, I didn't know that I said not to use made4. When did I say 
this? I may have said I need to update some of the functions as I wrote 
the made4 package many years ago.

Susan, made4 calls ade4 but is designed to convert microarray and other 
Bioconductor data classes into formats that can be input into ade4. It 
calls ade4 (and other) plot functions but with more sensible defaults 
for genomics data (i.e. it doesn't label all of the objects!). I 
implemented the package with Guy and Jean, who wrote the paper you 
cited, and I wholeheartedly agree with all you say ;-)


However, Aoife, your code plot(ca(table,suprow=c(4,5))) can't be used for 
what you want. It plots rows 4 and 5 as supplementary points on the 
plot. These points are not used in the computation of the analysis, 
so this would not provide what you want. Have a look at the plots 
produced by the code below.
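
The effect of suprow can also be checked without any package. The following is a minimal base-R sketch of the textbook SVD formulation of CA, not the code that ca() itself runs; simple_ca and project_rows are hypothetical helper names used only for illustration. It builds the axes from genes 1-3 alone and then places genes 4 and 5 on those axes from their profiles, which is the sense in which supplementary points take no part in the computation.

```r
## Data from this thread (rows = genes, columns = codons)
codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                    ncol = 3,
                    dimnames = list(paste0("gene", 1:5), paste0("codon", 1:3)))

## simple_ca() is a hypothetical helper: CA via the SVD of the
## standardized residuals (an assumption-level sketch, not ca()'s code)
simple_ca <- function(N) {
  P     <- N / sum(N)        # correspondence matrix
  rmass <- rowSums(P)        # row masses
  cmass <- colSums(P)        # column masses
  S <- diag(1 / sqrt(rmass)) %*% (P - rmass %o% cmass) %*% diag(1 / sqrt(cmass))
  dec <- svd(S)
  list(sv      = dec$d,                                   # singular values
       colstd  = diag(1 / sqrt(cmass)) %*% dec$v,         # standard column coordinates
       rowprin = diag(1 / sqrt(rmass)) %*% dec$u %*% diag(dec$d))  # principal row coordinates
}

## project_rows() is also hypothetical: it places rows on the first two
## axes of an existing fit using only their profiles, without refitting
project_rows <- function(N, fit) {
  (N / rowSums(N)) %*% fit$colstd[, 1:2]
}

fit3 <- simple_ca(codonData[1:3, ])   # genes 1-3 define the axes
fit5 <- simple_ca(codonData)          # all five genes define the axes

## genes 4 and 5 placed on the axes built from genes 1-3
sup45 <- project_rows(codonData[4:5, ], fit3)
sup45

## the singular values of the two analyses differ, so the maps differ
fit3$sv
fit5$sv
```

Projecting the active rows (genes 1-3) with project_rows gives back their own principal coordinates, which is a quick sanity check that the projection is consistent with the fit.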

### --------------------------------------------
##  From here, you can copy/paste everything to R
##------------------------------------------------


## Your data... I renamed it, as table is a function in R

codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11,  8, 8, 10, 
7),  ncol=3, dimnames = list(c("gene1","gene2", "gene3", "gene4", 
"gene5"), c("codon1", "codon2","codon3")))

library(ca)
codonCA<-ca(codonData)

## Draw 2 plots, one with results of analysis of all the data,
# the other as you described

par(mfrow=c(1,2))
plot(ca(codonData,suprow=c(4,5)))
plot(codonCA)

## You will notice that the 2 plots are very different:
## the first (left) is a CA of rows 1-3 only, with rows 4 and 5 added
## afterwards as supplementary points; the second is a CA of all 5 rows.


## To run a CA on a dataset using made4 or ade4, use the following code

## install made4
## source("http://bioconductor.org/biocLite.R")
## biocLite("made4")

library(made4)

## example dataset
data(khan)
df<-khan$train

## The function ord will run PCA, CA or NSC,
## by default it runs CA (by calling dudi.coa from ade4)

myCA<- ord(df)
plot(myCA)
plotgenes(myCA)
plotarrays(myCA)


## using the ade4 library
library(ade4)
codonCA<-dudi.coa(codonData, scan=FALSE)
scatter(codonCA)


## However neither of these will do exactly as you wish
## made4 expects groups in the column not the rows (genes x samples)

library(made4)
codonCA<-ord(t(codonData))

## Create a factor which lists the groups of "nodes" of interest
fac<-factor(c(rep("Node1",3), rep("Node2", 2)))
fac
plot(codonCA, classvec=fac)



## but the function below will do what you need.


plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE, 
plotrowLabels=FALSE, pch=seq_along(levels(rowFac))+10, xax =1,  yax = 2,  ...) {

   require(made4)

   ## Map each level of a factor to the matching entry of newLabels
   ## (e.g. one plotting symbol or one colour per group)
   fac2char<-function(fac, newLabels) {
        if (!length(levels(fac))==length(newLabels)) stop("Number of labels does not equal the number of factor levels")
        vec<-as.character(factor(fac, labels=newLabels))
        if(is.numeric(newLabels)) vec<-as.numeric(vec)
        return(vec)
        }


   if (plotgroups)  s.groups(dudi$li, rowFac,  col=cols)
   if (!plotgroups) {
     pchs<-fac2char(rowFac, pch)
     cols<-fac2char(rowFac, cols)


     if (!plotrowLabels) s.var(dudi$li, boxes=FALSE, pch=pchs, col=cols, 
cpoint=2, clabel=0, xax=xax, yax=yax,  ...)
     if (plotrowLabels)  s.var(dudi$li, boxes=FALSE, col=cols,  xax=xax, 
yax=yax,  ...)
   }

   s.var(dudi$co, boxes=FALSE, pch=19, col="black", add.plot = TRUE, 
xax=xax, yax=yax,  ...)
}

##--------------------------------------------
## Examples: Function has 3 different options
##-------------------------------------------

library(ade4)
codonCA<-dudi.coa(codonData, scan=FALSE)

## Option 1. Plot a biplot (cases and samples) with points
## colored by rowFac

plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))

## Option 2. Same plot as above, but with labels rather than points

plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"), 
plotrowLabels=TRUE)

## Option 3, Same plot but put a circle around the groups
## If you look at the help page for s.groups (in made4)
## which calls s.class (in ade4) you will see you can also
## change the size and other details about the
## ellipse (or circle drawn around the groups)

plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))
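
The code above does not cover the first question in the quoted message below, i.e. reading the nodes of interest from a file rather than typing c(4,5). A minimal sketch, assuming a plain text file with one row name per line; the file here is a temporary one written by the example itself so that it runs on its own, and the level names Node1/Node2 simply mirror the factor fac used above.

```r
## Data from this thread (rows = genes, columns = codons)
codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                    ncol = 3,
                    dimnames = list(paste0("gene", 1:5), paste0("codon", 1:3)))

## Hypothetical input file: one row name per line. It is written here
## only so the example is self-contained; use your own path instead.
nodeFile <- tempfile(fileext = ".txt")
writeLines(c("gene4", "gene5"), nodeFile)

## Read the names, dropping stray whitespace and blank lines
nodes <- trimws(readLines(nodeFile))
nodes <- nodes[nzchar(nodes)]

## Stop early if the file names a row that is not in the data
unknown <- setdiff(nodes, rownames(codonData))
if (length(unknown) > 0) stop("Not in the data: ", paste(unknown, collapse = ", "))

## Row indices, i.e. what was typed by hand as c(4,5)
nodeIdx <- match(nodes, rownames(codonData))

## Grouping factor in the same form as fac above
fac <- factor(ifelse(rownames(codonData) %in% nodes, "Node2", "Node1"))
fac
```

fac can then be passed as rowFac to plotCA, and nodeIdx can be given to suprow if supplementary rows are what is wanted.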




On Thu, Mar 8, 2012 at 9:20 AM, aoife doherty 
<aoife.m.doherty at gmail.com> wrote:

 > Many thanks. I tried this:
 >
 > table <- structure(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11,
 >    8, 8, 10, 7), .Dim = c(5L, 3L), .Dimnames = list(c("gene1",
 >    "gene2", "gene3", "gene4", "gene5"), c("codon1", "codon2",
 >    "codon3")))
 >
 > library(ca)
 >
 > plot(ca(table,suprow=c(4,5)))
 >
 > This will give me a ca plot, where the nodes of interest 4,5 are open
 > circles.
 >
 > However i have two questions.
 >
 > 1. Is it possible instead of manually typing in 4 and 5 to somehow get R to
 > read in a list of nodes of interest. Basically is it possible to change:
 >
 > c(4,5) to c(all the nodes that are in a file)
 >
 > and
 >
 > 2. Is it possible instead of the individual nodes of interest being open
 > circles, if the area encompassing all the nodes of interest could be shaded
 > differently/highlighted.
 > i THINK this is where your suggestion of:
 >
 > Your best bet is to use the package ade4
 > using res=dudi.coa(data)
 > then
 > s.class(res$li,group)
 > where group is your grouping variable you want to highlight.
 >
 > comes in, but i am completely new at R, i have genuinely tried to
 > understand the packages from the manual, I am confused however.
 >
 > Aoife
 >
 >
 >
 >
 >

-- 
Aedin Culhane
Computational Biology and Functional Genomics Laboratory
Harvard School of Public Health,
Dana-Farber Cancer Institute

web: http://www.hsph.harvard.edu/research/aedin-culhane/
email: aedin at jimmy.harvard.edu
phone: +1 617 632 2468
Fax: +1 617 582 7760


Mailing Address:
Attn: Aedin Culhane, SM822C
450 Brookline Ave.
Boston, MA 02215


