[BioC] need to confirm this for GOHyperG

Weiwei Shi helprhelp at gmail.com
Thu Nov 30 21:24:59 CET 2006


Hi,

I have a heavily customized GSEA (gene set enrichment analysis) project
and want to try the hypergeometric distribution as one of my
approaches. After "carefully" reading the documentation and related
discussions on this list, I think the following assignments are right,
and I would like to have them confirmed before I write the code:

N: the universe, or the "urn", which is the number of genes in the
intersection between "union of genes in all GO terms" AND "your chip".

n: the number of selected genes (most of the time, the differentially
expressed genes), restricted to N (that is, after removing all genes
not in N).

D: success events for a specific GO term in the universe, i.e. all
genes in that GO term, again after removing all genes not in N.

k: success events in the selected genes for a specific GO term,
reduced by removing all genes not in N.
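
To make the four quantities concrete, here is a minimal sketch in
plain Python (standard library only) of how I would compute them from
gene ID sets, assuming the definitions above are right. The function
names and the toy gene IDs are hypothetical and only for illustration;
this is not what GOHyperG does internally. The upper tail P(X >= k) is
my assumption for an over-representation test.

```python
from math import comb

def hypergeom_counts(chip_genes, go_annotated_genes, selected_genes, term_genes):
    """Return (N, n, D, k), with every set restricted to the universe."""
    # Universe: genes that are on the chip AND annotated to at least one GO term.
    universe = set(chip_genes) & set(go_annotated_genes)
    selected = set(selected_genes) & universe   # drop selected genes not in N
    term = set(term_genes) & universe           # drop term genes not in N
    return len(universe), len(selected), len(term), len(selected & term)

def upper_tail_pvalue(N, n, D, k):
    """P(X >= k) when drawing n genes from N, of which D belong to the term."""
    hi = min(n, D)  # largest possible overlap
    favourable = sum(comb(D, i) * comb(N - D, n - i) for i in range(k, hi + 1))
    return favourable / comb(N, n)

# Toy data (hypothetical gene IDs).
chip = {"g%d" % i for i in range(1, 11)}                 # g1..g10 on the chip
annotated = {"g%d" % i for i in range(1, 9)} | {"g11", "g12"}
selected = {"g1", "g2", "g3", "g9", "g11"}               # g9, g11 fall outside N
term = {"g1", "g2", "g4", "g12"}                         # g12 falls outside N

N, n, D, k = hypergeom_counts(chip, annotated, selected, term)
print(N, n, D, k)                      # 8 3 3 2
print(upper_tail_pvalue(N, n, D, k))   # 16/56, roughly 0.2857
```

If I read the R documentation correctly, the same tail probability
would be phyper(k - 1, D, N - D, n, lower.tail = FALSE).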

I think the key point here is that every quantity in the general
hypergeometric distribution has to be restricted to the genes on the
chip. Am I right? Please correct me if I am wrong.

Thanks.

-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III


