[BioC] GAGE analysis

Luo Weijun luo_weijun at yahoo.com
Fri Dec 27 17:46:25 CET 2013


Roopa,
GAGE can be used for various high-throughput datasets: microarray, RNA-Seq, proteomics, metabolomics data, etc. For proteomics and metabolomics data with a few thousand or a few hundred molecules (genes/proteins/metabolites), it is fine that the total number of molecules is not as large as in microarray or RNA-Seq data (usually tens of thousands). The main concern is that small pathways (gene sets) may not have enough members present in the data, in which case their test results become unreliable or meaningless. GAGE sets the minimum testable effective gene set size to 10 genes/molecules. I’ve applied GAGE to metabolomics datasets of a few hundred metabolites and still got sensible results for a good number of pathways.
You may use Entrez Gene IDs, official gene symbols or other types of gene (or molecule) IDs, but please make sure that the gene IDs in your data and in your gene sets/pathways are of the same type. Please go through the gage vignette:
http://bioconductor.org/packages/release/bioc/vignettes/gage/inst/doc/gage.pdf 
Note that the pathway analysis results can be visualized using pathview:
http://bioconductor.org/packages/release/bioc/html/pathview.html
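On the ID type point above, a quick sanity check before running any analysis is to see how many of your data IDs appear in the gene sets at all. The following is a rough sketch in plain Python, not part of gage; the function name is hypothetical and the numeric IDs are made-up placeholders, not real Entrez Gene IDs.

```python
# Rough sketch: an overlap near zero usually means the data and the gene
# sets use different ID types (e.g. symbols vs. numeric gene IDs).
def id_overlap_fraction(data_ids, gene_sets):
    """Fraction of data IDs that occur in at least one gene set."""
    data = set(data_ids)
    if not data:
        return 0.0
    universe = set().union(*gene_sets.values())
    return len(data & universe) / len(data)


# Symbols taken from the example data in the quoted question.
symbols = ["XIST", "SERPINB6", "ARHGEF12", "TSIX"]

# Hypothetical gene sets: one keyed by symbols, one by placeholder numeric IDs.
sets_by_symbol = {"toy_set": ["SERPINB6", "ARHGEF12", "TP53"]}
sets_by_number = {"toy_set": ["1001", "1002", "1003"]}

matched = id_overlap_fraction(symbols, sets_by_symbol)
mismatched = id_overlap_fraction(symbols, sets_by_number)
print(matched, mismatched)
```

If the second kind of result shows up with real data, the IDs need to be converted to a common type first.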
HTH
Weijun



--------------------------------------------
On Thu Dec 26, Roopa Subbaiaih wrote:

Hi all,

I was looking for good packages that do gene enrichment analysis. I came
across GAGE, which provides good analysis. My question is: can it take
proteomics data too, or is it designed for microarray data sets only? The
data which I would like to analyze looks like this:

Symbol      Unaff (FC)   Aff (FC)
XIST        639.271      890.0833
C7orf63     261.7478     93.08333
SERPINB6    156.641      136.84
LINC00273   130.3398     60.60667
LOC401497   103.14       175.8533
ARHGEF12    94.00556     96.20667
TSIX        83.19314     80.91333
POM121L8P   80.03056     52.72

The first column comprises symbols while the rest of the columns are fold
changes (diseased/N) for different sample sets (Species=Human). The total
number of differentially expressed genes is around 1200.

Please advise.

Thanks, Roopa


