[BioC] questions about gage

Luo Weijun luo_weijun at yahoo.com
Fri Aug 29 19:58:33 CEST 2014


First, you may want to read a few similar questions on GAGE, which explain how GAGE works:
http://seqanswers.com/forums/showthread.php?p=148507#4
You can always use the native GAGE/Pathview workflow instead of the joint workflow (with other tools like DESeq). The former is more powerful; the latter exists for users’ convenience.
Small sample sizes are common in current RNA-seq datasets, which raises statistical concerns for differential expression analysis in general under such conditions: http://www.biomedcentral.com/1471-2105/14/91. In this sense, p-values or test statistics could be less robust than fold changes as per-gene differential expression scores. That said, you can always use differential expression statistics other than fold change, and compare the effect of using different per-gene scores/statistics, as in section 5 of the tutorial.
GAGE is unlikely to generate false positives whether you use fold changes or other test statistics, because GAGE tests the mean of tens or hundreds of genes in a gene set or pathway against the background of all genes. In addition, GAGE applies FDR control to limit false positives.
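To make the last point concrete, here is a minimal sketch of the general idea: score a gene set by comparing the mean of its per-gene fold changes with the all-gene background, then adjust the set-level p-values for multiple testing. This is not GAGE's actual implementation, and GAGE's own test differs in detail. The simple z-test, the Benjamini-Hochberg adjustment, and all function and variable names below are assumptions chosen for illustration only.

```python
import math
import statistics


def set_z_score(scores, gene_set):
    """z-score of a gene set's mean score against the all-gene background.

    scores: dict mapping gene ID -> per-gene score (e.g. log2 fold change).
    gene_set: iterable of gene IDs; IDs missing from scores are ignored.
    """
    background = list(scores.values())
    mu = statistics.fmean(background)
    sigma = statistics.pstdev(background)
    members = [scores[g] for g in gene_set if g in scores]
    if not members:
        raise ValueError("gene set has no genes with scores")
    # Averaging over m genes shrinks the noise by sqrt(m), so one extreme
    # gene has little influence on the set-level statistic.
    return (statistics.fmean(members) - mu) / (sigma / math.sqrt(len(members)))


def upper_tail_p(z):
    """One-sided (up-regulation) p-value under a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy usage with made-up fold changes and hypothetical gene sets.
toy_fc = {f"g{i}": ((i * 37) % 21 - 10) / 10 for i in range(100)}
toy_sets = {"setA": ["g0", "g1", "g2", "g3"], "setB": ["g10", "g20", "g30"]}
p_values = [upper_tail_p(set_z_score(toy_fc, s)) for s in toy_sets.values()]
q_values = bh_adjust(p_values)
```

Because the statistic is a mean over the whole set, a single gene with a large but unreliable fold change shifts it only slightly, which is the intuition behind the robustness argument above.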


On 8/28/2014, Chun wrote:
> Hi Dr. Luo,
> 
> Hope this email finds you well. Recently I tried to use your RNAseq 
> pipeline to analyze our data. Could you please help me to clarify two 
> questions?
> 
> Currently I am trying to use DESeq2 first to get a list of log ratio 
> changes and then feed it into gage (as described in your vignette “RNA-seq 
> data pathway and gene-set analysis workflows”, section 6.1). In this case, we 
> will lose statistics information for all genes, right? Those genes with 
> high fold change but small p-value (from DESeq2) could lead to false 
> discoveries of the enriched gene sets. And I am a little bit confused 
> about how you calculate p-val and q-val for different sets in this 
> case. We will not even have 1-on-1 comparison. How do you determine the 
> statistics for each gene?
> 
> Chun
>


