[BioC] edgeR - multiple comparisons

Sun May 22 22:24:19 CEST 2011

Hi Sridhara,

The problem here is that the output of topTags() (your 'fdr06') is not a data.frame or matrix, which is what write.table() expects. The results table itself is stored in its 'table' component, so write that out instead:

fdr06 <- topTags(de06.tgw, n = nrow(de06.tgw), adjust.method = "BH", sort.by="p.value")
write.table(fdr06$table, file = "FDR06.csv", sep=",")
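
To see why the original call fails, here is a minimal sketch in base R. It does not use edgeR: 'make_fake_tags' is a hypothetical helper, and it assumes the topTags() result behaves like a list holding the results table plus a length-1 adjust.method and a length-2 comparison, which is what the lengths 10427, 1, 2 in the error message suggest. write.table() coerces such an object with data.frame(), which can only recycle the short components when the row count is a whole multiple of their lengths. That is probably also why n = 10426 (even) ran and n = 10427 (odd) did not.

```r
# Hypothetical stand-in for a topTags() result (assumption: a list with a
# results table plus two short metadata components).
make_fake_tags <- function(n_rows) {
  list(
    table = data.frame(logFC = seq_len(n_rows) / 10,
                       PValue = seq_len(n_rows) / 100),
    adjust.method = "BH",
    comparison = c("T6", "P18")
  )
}

out_file <- tempfile(fileext = ".csv")

# Odd row count: 3 rows cannot be recycled against the length-2 component,
# so the coercion inside write.table() stops with an error.
odd_attempt <- try(write.table(make_fake_tags(3), file = out_file, sep = ","),
                   silent = TRUE)
failed_on_odd <- inherits(odd_attempt, "try-error")

# Even row count: the short components are recycled, so the call runs,
# but the metadata is repeated on every row of the file.
even_attempt <- try(write.table(make_fake_tags(4), file = out_file, sep = ","),
                    silent = TRUE)
failed_on_even <- inherits(even_attempt, "try-error")

# Writing only the table component works for any number of rows.
fake_tags <- make_fake_tags(3)
write.table(fake_tags$table, file = out_file, sep = ",",
            col.names = NA, qmethod = "double")
read_back <- read.csv(out_file, row.names = 1)
```

With the real object, the same idea is the fdr06$table call above; col.names = NA keeps the gene names in a properly headed first column.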

Cheers,
Mark

On May 22, 2011, at 11:02 PM, Sridhara Gupta Kunjeti wrote:

> Hello Mark,
> Thanks for your email. I have one quick question. Is it possible to export all 10,427 genes after running exactTest()? What argument do I need to use to do that? Basically, I want the complete list of genes with the following info:
> > topTags(de06.tgw, n = 10, adjust.method="BH", sort.by="p.value")
> Comparison of groups: T6-P18 
>                                                                     logConc      logFC       PValue          FDR
> PITG_08841 | Pi conserved hypothetical protein (129 nt)           -28.79463  42.442850 1.032735e-11 1.076833e-07
> PITG_08845 | Pi mannitol dehydrogenase, putative (1065 nt)        -12.93992   9.148329 1.288618e-09 6.193586e-06
> 
> If I use the following argument, it is showing an error message.
> 
> fdr06<- topTags(de06.tgw, n = 10,427, adjust.method = "BH", sort.by="p.value")
> write.table(fdr06, file = "FDR06.csv", sep=",", col.names = NA, qmethod="double")
> Error in data.frame(table = list(logConc = c(-28.7946, -12.93992, : arguments imply differing number of rows: 10427, 1, 2
>  
> If I do the same with n = 10426, it executes without any error, except that I am missing one row.
>  
> Any suggestions on how to export all the columns for all the rows would be a great help.
>  
> Many thanks!
> Sridhara
>  
> 
> 
> 
> On Sun, May 22, 2011 at 5:34 AM, Mark Robinson <mrobinson at wehi.edu.au> wrote:
> Hi Sridhara,
> 
> If you haven't already, you might have a solid read of the edgeR user's guide; it has answers to some of your questions.
> 
> 
> On May 21, 2011, at 11:20 PM, Sridhara Gupta Kunjeti wrote:
> 
> > Hello,
> > I have used edgeR for DGE analysis and I have a few questions regarding the
> > model and comparisons.
> >
> > 1) What kind of statistical model is taken into account to analyze treatment
> > structure and conduct analysis of variance?
> 
> For the example you show below (a 2-group comparison), the 'Negative binomial models' section in the user's guide covers this. Of course, the package also handles more complicated "treatment structure" through generalized linear models (see the 'Experiment with multiple factors' section, for example).
> 
> 
> > 2) How does edgeR correct for multiple comparisons?
> 
> See ?topTags; it's also mentioned in the user's guide.
> 
> ----
>     topTags(object, n=10, adjust.method="BH", sort.by="p.value")
> ...
> adjust.method: character string stating the method used to adjust
>          p-values for multiple testing, passed on to ‘p.adjust’
> ...
> ----
> 
> 
> > 3) I am assuming that the p-values in the output after estimating the
> > tagwise dispersion have already been adjusted for multiple testing.
> > Please correct me if I am wrong. If so, what kind of multiple testing
> > correction is applied?
> 
> exactTest() doesn't do the multiple testing correction, but topTags() does.
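
As a rough illustration of that difference, the sketch below applies p.adjust() from base R to a handful of made-up p-values (not edgeR output). The help text quoted above says adjust.method is passed on to p.adjust, so "BH" here is the Benjamini-Hochberg adjustment; the variable names are only for illustration.

```r
# Made-up raw p-values standing in for the unadjusted p-value column
# (illustration only, not real edgeR results).
raw_p <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# Benjamini-Hochberg adjustment, the method named by adjust.method = "BH".
fdr <- p.adjust(raw_p, method = "BH")

# Count how many tests pass a 0.05 cutoff before and after adjustment.
n_raw <- sum(raw_p < 0.05)
n_fdr <- sum(fdr < 0.05)

print(data.frame(raw_p = raw_p, fdr = fdr))
```

The adjusted values are never smaller than the raw ones, so filtering on the raw p-values from exactTest() would typically call more genes significant than filtering on the FDR column from topTags().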
> 
> HTH,
> Mark
> 
> 
> >
> > The arguments that I passed are as follows:
> >> raw.data <- read.delim("c33_con3.txt")
> >> raw.data.2a <- read.delim ("2c33_con3.txt")
> >> d2a <- raw.data.2a[, 2:5]
> >> rownames(d2a) <- raw.data.2a[,1]
> >> group2a <- c(rep("c33", 2), rep("con3", 2))
> >> d2a <- DGEList(counts = d2a, group = group2a)
> >> d2a <- estimateCommonDisp(d2a)
> >> d2a <- estimateTagwiseDisp(d2a, prior.n = 10, grid.length = 500)
> >> prior.n2a <- estimateSmoothing(d2a)
> >> de2a.tgw <- exactTest(d2a, common.disp = FALSE)
> >> de2a.tgw
> > An object of class "DGEExact"
> > $table
> >
> >                                                                      logConc       logFC   p.value
> > MGG_00005 | Mo hypothetical protein (1014 nt)                      -16.67772  0.05248378 0.9394668
> > MGG_00015 | Mo catechol O-methyltransferase (1102 nt)              -14.68066  0.36189877 0.2786389
> > MGG_00016 | Mo 2-epi-5-epi-valiolone synthase (1739 nt)            -13.50677  0.32379041 0.3759259
> > MGG_00017 | Mo L-aminoadipate-semialdehyde dehydrogenase (3472 nt) -14.28686 -0.35747999 0.3040601
> > MGG_00018 | Mo integral membrane protein (2504 nt)                 -14.56791  0.45187243 0.1701996
> > 11452 more rows ...
> > $comparison
> > [1] "c33"  "con3"
> > $genes
> > NULL
> >
> >
> >> sessionInfo()
> > R version 2.12.1 (2010-12-16)
> > Platform: i386-pc-mingw32/i386 (32-bit)
> > locale:
> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> > States.1252    LC_MONETARY=English_United States.1252
> > [4] LC_NUMERIC=C                           LC_TIME=English_United
> > States.1252
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> > other attached packages:
> > [1] edgeR_2.0.3
> > loaded via a namespace (and not attached):
> > [1] limma_3.6.9  tools_2.12.1
> >
> > I would really appreciate your comments or suggestions.
> >
> > Many thanks!
> >
> > Sridhara
> >
> > --
> > Sridhara G Kunjeti
> > PhD Candidate
> > University of Delaware
> > Department of Plant and Soil Science
> > email- sridhara at udel.edu
> > Ph: 832-566-0011
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> ------------------------------
> Mark Robinson, PhD (Melb)
> Epigenetics Laboratory, Garvan
> Bioinformatics Division, WEHI
> e: mrobinson at wehi.edu.au
> e: m.robinson at garvan.org.au
> p: +61 (0)3 9345 2628
> f: +61 (0)3 9347 0852
> ------------------------------
> 
> 
> ______________________________________________________________________
> The information in this email is confidential and intended solely for the addressee.
> You must not disclose, forward, print or use it without the permission of the sender.
> ______________________________________________________________________
> 
> 
> 
