[BioC] Limma with contrasts

Fri Jan 8 19:39:05 CET 2010

Hi Matthew,

Matthew Willmann wrote:
> Dear list,
> 
> I performed a 9 array Affymetrix microarray experiment with three genotypes and three replicates of each.  I am trying to do GCRMA normalization, filtering based on MAS5 calls, and then use limma with contrasts (and BH to adjust the p-values) to identify differentially expressed genes.  I have no problems with the normalization and filtering, but I am having trouble with the limma aspect, specifically generating the results of the contrasts with BH and writing the results to a table.  In the past, I have only used limma with experiments involving two genotypes.  I have pasted my R console session below.  The table I get from the below commands has four columns, one with the affy IDs and one for each of the three contrasts, but the only entries are 1,0, and -1.  Any advice is much appreciated.
> 
> Thank you.
> 
> 
> Matthew
> 
> 
> -----------------------------------------------------
> Matthew R. Willmann, Ph.D.
> Research Associate, Poethig Lab
> University of Pennsylvania
> Department of Biology
> 433 S. University Avenue
> Philadelphia, PA 19104
> Lab phone: 215-898-8916
> Cell: 508-243-2495
> Fax: 215-898-8780
> 
> 
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
>> dir()
> [1] "Grandcentral (3864) WT1 ATH1.CEL"            
> [2] "Grandcentral (3865) WT2 ATH1.CEL"            
> [3] "Grandcentral (3866) WT3 ATH1.CEL"            
> [4] "Grandcentral (3867) cct1 ATH1.CEL"           
> [5] "Grandcentral (3868) cct2 ATH1.CEL"           
> [6] "Grandcentral (3869) cct3 ATH1.CEL"           
> [7] "Grandcentral (3870) gct1 ATH1.CEL"           
> [8] "Grandcentral (3871) gct2 ATH1.CEL"           
> [9] "Grandcentral (3872) gct3 ATH1.CEL"           
> [10] "R Console.txt"                               
> [11] "R Console2.txt"                              
> [12] "R ConsoleFIRST.txt"                          
> [13] "StewartarraysLIMMAresults_01072010.txt"      
> [14] "StewartarraysLIMMAresults_01072010.txt.fmt"  
> [15] "StewartarraysLIMMAresults_01072010_A.txt"    
> [16] "StewartarraysLIMMAresults_01072010_B.txt"    
> [17] "StewartarraysLIMMAresults_01072010_C.txt"    
> [18] "StewartarraysLIMMAresults_01072010_D.txt"    
> [19] "StewartarraysLIMMAresults_01072010_E.txt"    
> [20] "StewartarraysLIMMAresults_01072010_E.txt.fmt"
> [21] "csgtest.txt"                                 
> [22] "dataStewartgcrma01072010.txt"                
>> library(affy)
> Loading required package: Biobase
> Loading required package: tools
> 
> Welcome to Bioconductor
> 
>  Vignettes contain introductory material. To view, type
>  'openVignette()'. To cite Bioconductor, see
>  'citation("Biobase")' and for packages 'citation(pkgname)'.
> 
> Loading required package: affyio
> Loading required package: preprocessCore
>> library(gcrma)
> Loading required package: matchprobes
> Loading required package: splines
>> library(limma)
>> data.Stewart=RedAffy(filenames=Cel.files)
> Error: could not find function "RedAffy"
>> data.Stewart=ReadAffy(filenames=Cel.files)
> Error in AllButCelsForReadAffy(..., filenames = filenames, widget = widget,  : 
>  object "Cel.files" not found
>> Cel.files=list.files(pattern=".CEL")
>> Cel.files
> [1] "Grandcentral (3864) WT1 ATH1.CEL"  "Grandcentral (3865) WT2 ATH1.CEL" 
> [3] "Grandcentral (3866) WT3 ATH1.CEL"  "Grandcentral (3867) cct1 ATH1.CEL"
> [5] "Grandcentral (3868) cct2 ATH1.CEL" "Grandcentral (3869) cct3 ATH1.CEL"
> [7] "Grandcentral (3870) gct1 ATH1.CEL" "Grandcentral (3871) gct2 ATH1.CEL"
> [9] "Grandcentral (3872) gct3 ATH1.CEL"
>> data.Stewart=ReadAffy(filenames=Cel.files)
>> eset<-gcrma(data,fast=F)
> Error in function (classes, fdef, mtable)  : 
>  unable to find an inherited method for function "indexProbes", for signature "function", "character"
>> eset<-gcrma(data.Stewart,fast=F)
> Adjusting for optical effect.........Done.
> Computing affinities.Done.
> Adjusting for non-specific binding.........Done.
> Normalizing
> Calculating Expression
>> data.pma<-mas5calls(data.Stewart)
> Getting probe level data...
> Computing p-values
> Making P/M/A Calls
>> data.pma.exprs=exprs(data.pma)
>> index.9arrays=grep(".CEL",colnames(data.pma.exprs))
>> numP=apply(data.pma.exprs[,index.9arrays]=="P",1,sum)
>> gene.select=which(numP!=0)
>> length(gene.select)
> [1] 17920
>> data.wk=eset[gene.select,]
>> write.table(data.wk,"csgtest08.txt",sep="\t")

I can't imagine that worked well - unless write.table() has been 
extended, it should be expecting a data.frame or matrix, not an 
ExpressionSet.

>> design <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3)))
>> colnames(design) <- c("group1", "group2", "group3")
>> fit <- lmFit(eset, design)
>> contrast.matrix <- makeContrasts(group2-group1, group3-group2, group3-group1, levels=design)
>> fit2 <- contrasts.fit(fit, contrast.matrix)
>> fit2 <- eBayes(fit2)
>> results <- decideTests(fit2, method="separate")
>> objects()
> [1] "Cel.files"             "affinity.spline.coefs" "contrast.matrix"      
> [4] "data.Stewart"          "data.pma"              "data.pma.exprs"       
> [7] "data.wk"               "design"                "eset"                 
> [10] "fit"                   "fit2"                  "gene.select"          
> [13] "index.9arrays"         "numP"                  "results"              
>> write.table(results,"StewartarraysLIMMAresults_01082010.txt",sep="\t",col.names=NA)

The results object should just be a matrix containing (1, 0, -1). It can 
be used to make a Venn Diagram, but without further code isn't that 
useful by itself.

<self-promotion>

Above you mention that you want output for each comparison. There are 
several ways to do that, but IMO, the easiest way is to use either 
limma2annaffy() or vennSelect() from the affycoretools package. The 
first function will output either HTML or text (or both) tables with all 
genes that fulfill p-value (or FDR levels in your case) and/or fold 
change criteria. All you need to do is feed in some of the objects you 
have already created.

If you are interested in the output from a Venn Diagram (try the 
following first)

vennDiagram(results)

where you will get 7 tables containing the genes from each part of the 
Venn Diagram, then you would use vennSelect().

</self-promotion>

Alternatively, since you used the "separate" argument to decideTests(), 
you could just use

write.table(topTable(fit2, coef=1), "group2-group1.txt")

to get e.g., the first comparison, switching the coef (and possibly the 
p-value and lfc arguments as you wish) to just get the output from limma.

Best,

Jim

> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-5646
734-936-8662
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues