[BioC] testing GO categories with Fisher's exact test.

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Tue Feb 24 12:51:16 MET 2004


Surely, at the point where you are seeing "lots" of, e.g., apoptosis genes in your cluster,
you drop the statistics and start the biology?

Remember that the ultimate proof of any statistical result is that it makes biological sense and is biologically validated. Do we really need to know whether an annotation is significant?

-----Original Message-----
From: Nicholas Lewin-Koh [mailto:nikko at hailmail.net]
Sent: 24 February 2004 08:33
To: bioconductor at stat.math.ethz.ch
Cc: rdiaz at cnio.es
Subject: [BioC] testing GO categories with Fisher's exact test.


Hi all,
I have a few questions about testing for over-representation of terms in
a cluster.
Let's consider a simple case: a set of chips from an experiment, say
treated and untreated, with 10,000 genes on the chip and 1,000
differentially expressed. Of the 10,000, 7,000 can be annotated and 6,000
have a GO function assigned to them at a suitable level. Say for this
example there are 30 GO classes that appear.
I then conduct Fisher's exact test 30 times, once for each GO category, to
detect differential representation of terms in the differentially
expressed set, and correct for multiple testing.
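
To make the procedure concrete, here is a minimal sketch in Python using only the standard library. It treats each per-category test as a one-sided Fisher's exact test (the upper tail of a hypergeometric distribution) and applies a Bonferroni correction. Several things here are assumptions and not part of the setup above: the number of differentially expressed genes that carry a GO term (600), the category names and counts, the one-sided alternative, and the choice of Bonferroni as the correction. Only three categories are shown for brevity; with 30 categories the adjustment factor would be 30.

```python
from math import comb

def fisher_upper_tail(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= k), where X is hypergeometric: the number of category
    members among n genes drawn without replacement from N genes, K of
    which belong to the category.
    """
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total

def bonferroni(pvalues):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

N_ANNOTATED = 6000  # genes with a GO term (from the example above)
N_DIFF = 600        # ASSUMPTION: differentially expressed genes that have a GO term

# ASSUMPTION: made-up counts per category,
# (genes in category, of which differentially expressed)
categories = {
    "apoptosis": (200, 40),
    "cell cycle": (300, 32),
    "transport": (500, 45),
}

raw = [fisher_upper_tail(k, N_DIFF, K, N_ANNOTATED) for K, k in categories.values()]
adjusted = bonferroni(raw)
for name, p, q in zip(categories, raw, adjusted):
    print(f"{name}: raw p = {p:.3g}, Bonferroni-adjusted p = {q:.3g}")
```

Note that this sketch uses the 6,000 GO-annotated genes as the reference set, which is itself a choice and bears on the concerns about validity.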

My question is about the validity of this procedure. Just from experience,
many genes will have multiple functions assigned to them, so the genes
falling into GO classes are not independent.
Also, there is the large set of unannotated genes, so we are in effect
ignoring the influence of all the unannotated genes on the outcome. Do
people have any thoughts or opinions on these approaches? They are
appearing all over the place in bioinformatics tools like FATIGO, EASE,
DAVID, etc. I find that the formal testing approach makes me very
uncomfortable, especially as the biologists I work with tend to
over-interpret the results.
I am very interested to see the discussion on this topic.
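
As a rough illustration of the point about unannotated genes, the sketch below (Python, standard library only) computes the same one-sided Fisher p-value for a single category under two reference sets: only the 6,000 GO-annotated genes, or all 10,000 genes on the chip with unannotated genes counted as non-members. The category size (200), the number of its genes that are differentially expressed (40), and the number of differentially expressed genes with a GO term (700) are hypothetical values chosen for illustration.

```python
from math import comb

def fisher_upper_tail(k, n, K, N):
    """P(X >= k) for X hypergeometric: n draws from N genes, K in the category."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total

K = 200  # ASSUMPTION: genes annotated to the category
k = 40   # ASSUMPTION: of those, differentially expressed

# Reference set 1: only genes with a GO term.
# ASSUMPTION: 700 of the 1,000 differentially expressed genes have a GO term.
p_annotated = fisher_upper_tail(k, 700, K, 6000)

# Reference set 2: every gene on the chip, unannotated genes treated as
# "not in the category".
p_full = fisher_upper_tail(k, 1000, K, 10000)

print(f"annotated-only reference set: p = {p_annotated:.3g}")
print(f"full-chip reference set:      p = {p_full:.3g}")
```

With these made-up counts the two reference sets give different p-values for the same category, so the treatment of unannotated genes is not a neutral choice.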

Nicholas

