[BioC] how to test for genes of interest?

Glyn Bradley glyn.bradley at googlemail.com
Thu Jul 24 18:17:58 CEST 2008


'Mine it for biological significance' used to be a very vague
idea... until the advent of Ingenuity Pathway Analysis!! :)

If it were me and I were going to do FDR, I'd do it on the full list, not on the subset.
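
To make the two options concrete, here is a minimal Python sketch (not limma code) of a Benjamini-Hochberg adjustment applied once to the full list and once to an a priori subset. The gene names and p-values are invented for illustration; in a real analysis they would be the p-values from the model fitted to all genes. In R the same adjustment is p.adjust(p, method = "BH").

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values (made up for this sketch, not from any real data set).
pvals = {
    "geneA": 0.001, "geneB": 0.01, "geneC": 0.04, "geneD": 0.2,
    "geneE": 0.5, "geneF": 0.03, "geneG": 0.8, "geneH": 0.6,
}
# Hypothetical list of genes chosen before looking at the data.
genes_of_interest = ["geneA", "geneB", "geneC"]

names = list(pvals)
# Option 1: correct across every gene on the array.
full = dict(zip(names, bh_adjust([pvals[g] for g in names])))
# Option 2: correct only across the a priori list.
subset = dict(zip(genes_of_interest,
                  bh_adjust([pvals[g] for g in genes_of_interest])))

for g in genes_of_interest:
    print(f"{g}: p={pvals[g]:.3f}  full-list BH={full[g]:.3f}  "
          f"subset BH={subset[g]:.3f}")
```

In this toy example geneC gets an adjusted value of 0.08 against the full list and 0.04 against the subset, so the choice of list can move a gene across a 0.05 cutoff. The subset correction is only defensible if the list really was fixed before looking at the data.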

Glyn

On Thu, Jul 24, 2008 at 5:00 PM, Jenny Drnevich <drnevich at illinois.edu> wrote:
> Hi Glyn,
>
> "...mine it for biological significance" is very vague and, in my
> experience, a very subjective sort of analysis. I do agree that with a
> particular list, in this case immune genes, doing something like GSEA could
> be appropriate. However, GSEA gives an answer along the lines of "yes,
> immune genes appear to be important" and not "which immune genes are
> changing, and which are not?" Besides, data mining is not included in my
> basic statistical analysis service. :) I was just wondering: if one were
> going to do the analysis I described, what is the proper way to do it?
>
> Thanks,
> Jenny
>
> At 10:48 AM 7/24/2008, Glyn Bradley wrote:
>>
>> Hi Jenny
>> I may get shot down horribly for saying this on this list, but isn't
>> there a large school of thought which says don't do FDR at all, just
>> take the large list of genes out and mine it for biological
>> significance?
>> Certainly a large pharma company I have a little experience of takes that
>> approach. Stats are just stats, after all. (And I'm sure you're going to
>> validate the results with some other wet lab technique anyway.)
>>
>>
>> Glyn PhD
>> Bioinf and systems modelling
>> mycib.ac.uk
>>
>> On Thu, Jul 24, 2008 at 4:14 PM, Jenny Drnevich <drnevich at illinois.edu>
>> wrote:
>> > Hi everyone,
>> >
>> > I've always heard that one of the ways "around" the multiple testing
>> > problem of microarrays is for you to a priori identify a particular
>> > list of genes you're interested in, and then you only have to do the
>> > multiple test correction for this smaller list. I've never done this
>> > in practice, and I'm not sure at what point in the analysis it's
>> > proper to pull out just the smaller list. Obviously, all the data
>> > preprocessing and normalization will be done with all the genes, but
>> > should I pull out the genes before fitting the model, or after
>> > fitting the model right before the multiple test adjustment? I'm
>> > using the eBayes() shrinkage in limma, so which genes are in the
>> > model will make a big difference in the outcome.
>> >
>> > I'm thinking it would be best to keep all the genes in the model, and
>> > then split them out into two groups (genes of interest and all the
>> > rest) and do an FDR correction separately for each group. What do you
>> > think?
>> >
>> > Thanks,
>> > Jenny
>> >
>> > Jenny Drnevich, Ph.D.
>> >
>> > Functional Genomics Bioinformatics Specialist
>> > W.M. Keck Center for Comparative and Functional Genomics
>> > Roy J. Carver Biotechnology Center
>> > University of Illinois, Urbana-Champaign
>> >
>> > 330 ERML
>> > 1201 W. Gregory Dr.
>> > Urbana, IL 61801
>> > USA
>> >
>> > ph: 217-244-7355
>> > fax: 217-265-5066
>> > e-mail: drnevich at illinois.edu
>> >
>> > _______________________________________________
>> > Bioconductor mailing list
>> > Bioconductor at stat.math.ethz.ch
>> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > Search the archives:
>> > http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>


