[BioC] "automatic association analysis"

Francois Pepin fpepin at cs.mcgill.ca
Fri Aug 25 18:04:43 CEST 2006


Hi Weiwei,

If you want to know whether a given set of genes (i.e. members of the
pathway) is behaving differently in a given set of arrays (i.e. your
disease samples), there are a few ways. The basic way to do this would
be to use a hypergeometric test (often used in the case of GO), although
it can be tricky to get right and has a few other issues.
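
To make the hypergeometric idea concrete, here is a minimal sketch in
plain Python. The function name and the four counts are my own
illustration, not taken from any Bioconductor package, and it assumes
you have already fixed the gene universe and the cutoff for calling a
gene differentially expressed (which is exactly where it gets tricky).

```python
from math import comb

def hypergeom_upper_tail(universe, pathway, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `pathway` belong to the pathway."""
    total = comb(universe, selected)
    largest = min(pathway, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, largest + 1)
    )
    return tail / total

# Hypothetical counts: 10000 genes on the array, 50 in the pathway,
# 200 called differentially expressed, 5 of those in the pathway.
p_value = hypergeom_upper_tail(10000, 50, 200, 5)
print(p_value)
```

A small p-value says the overlap is larger than expected by chance,
given that universe and that cutoff; change either and the answer can
change.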

There are other methods, such as the Gene Set Enrichment method in the
Category package, that combine a set of t-tests together. Other packages
like safe and sigPathway have different methods of doing the same thing.
There was a recent discussion of this on the mailing list that you would
probably want to look over.
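
As a rough illustration of what "combining a set of t-tests" can mean,
here is a sketch in plain Python: compute a t-statistic per gene, sum
them over the gene set, scale by the square root of the set size, and
judge the result by permuting the sample labels. This is one common
variant and an assumption on my part, not necessarily what Category,
safe or sigPathway do internally; all names and the toy data are
hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

def t_stat(a, b):
    """Welch t-statistic comparing two groups of expression values."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def set_score(expr, labels, gene_set):
    """Sum of per-gene t-statistics over the set, scaled by sqrt(set size)."""
    stats = []
    for gene in gene_set:
        case = [x for x, lab in zip(expr[gene], labels) if lab]
        ctrl = [x for x, lab in zip(expr[gene], labels) if not lab]
        stats.append(t_stat(case, ctrl))
    return sum(stats) / sqrt(len(stats))

def permutation_p(expr, labels, gene_set, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling sample labels."""
    rng = random.Random(seed)
    observed = set_score(expr, labels, gene_set)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(set_score(expr, shuffled, gene_set)) >= abs(observed):
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: three genes, six arrays (first three are disease samples).
expr = {
    "geneA": [5.1, 5.6, 5.3, 4.0, 4.2, 3.9],
    "geneB": [7.2, 7.9, 7.5, 6.1, 6.4, 6.0],
    "geneC": [3.3, 3.8, 3.1, 2.2, 2.9, 2.5],
}
labels = [1, 1, 1, 0, 0, 0]
score, p_value = permutation_p(expr, labels, ["geneA", "geneB", "geneC"])
print(score, p_value)
```

Permuting sample labels rather than genes keeps the correlation between
genes in the set intact, which is one of the things the simple
hypergeometric test ignores.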

As far as I can tell, all of those methods require that you have your
pathway already defined. Some databases like KEGG or BioCarta have
pathway definitions, but they don't cover all pathways and few, if
any, are up-to-date with the literature.

If we really care about a given pathway, we'll go and create our own
list ourselves from the database. It is important in such a case to
create the list before you've started looking at the differentially
expressed genes, because otherwise you would be biasing your results.
Of course, it is always good to be able to explain your results
biologically afterward, but this is not the same as showing a
statistically significant correlation with a pathway.

Hope this helps,

Francois

On Thu, 2006-08-24 at 18:57 -0400, Weiwei Shi wrote:
> Dear Listers:
> 
> I have a question that originated from pathway analysis:
> 
> Suppose I have found a pathway which strongly associates with a
> disease from pathway analysis; my question is how to validate this
> rule. I mean, is there any tool that does some automatic association
> analysis against the scientific record, like PubMed, and can give some
> evaluation of the strength of such an association?
> 
> thanks.
>


