[BioC] necessity of moderated t statistic and false discoveries for small predefined gene list?

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu May 17 17:54:58 CEST 2012


Hi Richard,

It seems to me that this paper is highly relevant to the question you
are trying to answer:

Independent filtering increases detection power for high-throughput experiments
http://www.pnas.org/content/107/21/9546.full

Perhaps you can see where your "filtering scheme" lands in the
landscape of filters described there.

HTH,
-steve

On Thu, May 17, 2012 at 9:25 AM, Richard Friedman
<friedman at cancercenter.columbia.edu> wrote:
> Moshe,
>
>        Thank you for the clarification on the moderated t-statistic.
> If I am only interested in 10 genes, is it better to calculate the moderated
> statistic, and hence the raw p-values, based on all of the genes on the array
> or on just those 10 genes?
>
> Best wishes,
> Rich
>
>
> On May 17, 2012, at 12:35 AM, Moshe Olshansky wrote:
>
>> Hi Rich,
>>
>> I think that Gordon Smyth (the author of limma) has explained on this list
>> what the moderated t-statistic is.
>> The brief explanation is that when there are few samples, the estimate of
>> the variance used in a standard t-test is quite noisy, and because
>> one must account for this noise the standard t-test has low statistical
>> power. The empirical Bayes model used in the moderated t-test allows the
>> variance to be estimated with more confidence and therefore gives better
>> power. So it can be used even if you are interested in just a few genes.
>> It has (almost) nothing to do with the multiple testing adjustment. Well,
>> one may ask whether moderated p-values satisfy the assumptions of multiple
>> testing adjustment procedures (in particular BH), but that is another
>> story. Maybe Gordon will comment on this.
>>
>> Best regards,
>> Moshe.
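
For reference, the variance moderation described above can be written out as follows. This is a sketch of the model as published for limma (Smyth, 2004), not something stated in this thread; the symbols are assumptions of the sketch: s_g^2 is the ordinary residual variance for gene g with d_g degrees of freedom, s_0^2 and d_0 are the prior variance and prior degrees of freedom estimated from all genes on the array, \hat{\beta}_g is the estimated log fold change, and v_g is the unscaled variance factor of that coefficient.

```latex
\tilde{s}_g^{2} = \frac{d_0\, s_0^{2} + d_g\, s_g^{2}}{d_0 + d_g},
\qquad
\tilde{t}_g = \frac{\hat{\beta}_g}{\tilde{s}_g \sqrt{v_g}},
\qquad
\tilde{t}_g \sim t_{\,d_0 + d_g} \ \text{under the null hypothesis.}
```

The ordinary t-statistic uses s_g in place of \tilde{s}_g and has only d_g degrees of freedom, which is where the gain in power comes from when d_g is small.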
>>
>>> Moshe and List,
>>>
>>>        Thanks for your reply. The method you describe retains
>>> the raw p-value based on the moderated t-statistic and adjusts
>>> it to give an adjusted p-value (usually a false discovery rate).
>>> However, as I understand it, the moderated
>>> t-statistic given by Limma, based on
>>> all of the genes on the array, pools variance information
>>> to moderate the standard deviation and so prevent fortuitously
>>> low p-values stemming from fortuitously low standard deviations
>>> encountered in thousands of multiple tests. I am wondering
>>> whether, if the experimentalist asks me to look up just 10 genes,
>>> I should use the unmoderated frequentist t-statistic, which
>>> will differ from the one in Limma and may imply significance
>>> where Limma does not. I guess another way to phrase it is
>>> "How many simultaneous tests does one need before one
>>> should prefer the moderated (empirical Bayes) statistic to the
>>> unmoderated one?" Or should I fit just those 10 genes
>>> (~30 Affy probes) with Limma?
>>>
>>> Best wishes,
>>> Rich
>>>
>>>
>>>
>>> On Thu, 17 May 2012, Moshe Olshansky wrote:
>>>
>>>> Hi Rich,
>>>>
>>>> Whether to use the moderated t-statistic or not does not depend on
>>>> whether you are interested in the 10 particular genes or in all
>>>> differentially expressed ones. That choice only affects your multiple
>>>> testing adjustment.
>>>> The simplest way for you to proceed is to use limma as usual, get the
>>>> topTable, but then take the UNADJUSTED p-values for your 10 genes of
>>>> interest and use the p.adjust function to adjust for multiple testing if
>>>> you wish. In any case you should also look at the (log) fold changes.
>>>>
>>>> Best regards,
>>>> Moshe.
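
A minimal R sketch of the workflow described above, run on simulated data. The expression matrix, the probe names, the two-group design, and the list `genes_of_interest` are all made up for illustration; `lmFit`, `eBayes`, and `topTable` are limma functions and `p.adjust` is base R. It assumes the limma package is installed.

```r
library(limma)

set.seed(1)

# Simulated log-expression matrix: 1000 probes x 6 arrays (3 per condition).
expr <- matrix(rnorm(1000 * 6), nrow = 1000,
               dimnames = list(paste0("probe", 1:1000), paste0("s", 1:6)))
group  <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)

# Fit on ALL probes, so the variance moderation borrows from the whole array.
fit <- eBayes(lmFit(expr, design))

# Full table in the original row order, so rows can be looked up by probe ID.
tab <- topTable(fit, coef = 2, number = Inf, sort.by = "none")

# Hypothetical predefined list of 10 probes.
genes_of_interest <- paste0("probe", 1:10)

# Keep the log fold change, the moderated t, the raw p-value, and the
# array-wide BH adjustment, then re-adjust the raw p-values over the 10 only.
sub <- tab[genes_of_interest, c("logFC", "t", "P.Value", "adj.P.Val")]
sub$adj.P.Val.subset <- p.adjust(sub$P.Value, method = "BH")

print(sub)
```

Here `adj.P.Val` is the adjustment over every probe on the array and `adj.P.Val.subset` is the adjustment over the 10 predefined probes only; the moderated t-statistics and raw p-values are the same in both cases.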
>>>>
>>>>
>>>>> Dear Bioconductor List,
>>>>>
>>>>>        I am using Limma to analyze differential expression between 2
>>>>> conditions on an Affy chip.
>>>>> My experimental collaborator asks for the differential expression of
>>>>> 10 predefined genes.
>>>>>
>>>>> A. Should I correct for false discoveries based upon all of the genes
>>>>> on the chip?
>>>>> B. If not, should I correct for false discoveries just for the
>>>>> probe IDs for the 10 predefined genes?
>>>>> C. Should I use the moderated t-statistic or just use an unmoderated
>>>>> t-test for those 10 genes?
>>>>>
>>>>> Thanks and best wishes,
>>>>> Rich
>>>>> ------------------------------------------------------------
>>>>> Richard A. Friedman, PhD
>>>>> Associate Research Scientist,
>>>>> Biomedical Informatics Shared Resource
>>>>> Herbert Irving Comprehensive Cancer Center (HICCC)
>>>>> Lecturer,
>>>>> Department of Biomedical Informatics (DBMI)
>>>>> Educational Coordinator,
>>>>> Center for Computational Biology and Bioinformatics (C2B2)/
>>>>> National Center for Multiscale Analysis of Genomic Networks (MAGNet)
>>>>> Room 824
>>>>> Irving Cancer Research Center
>>>>> Columbia University
>>>>> 1130 St. Nicholas Ave
>>>>> New York, NY 10032
>>>>> (212)851-4765 (voice)
>>>>> friedman at cancercenter.columbia.edu
>>>>> http://cancercenter.columbia.edu/~friedman/
>>>>>
>>>>> "School is an evil plot to suppress my individuality"
>>>>>
>>>>> Rose Friedman, age 15
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> ------------------------------------------------------------
>>> Richard A. Friedman, PhD
>>> Associate Research Scientist
>>> Herbert Irving Comprehensive Cancer Center
>>> Biomedical Informatics Shared Resource
>>> Lecturer
>>> Department of Biomedical Informatics
>>> Box 95, Room 130BB or P&S 1-420C
>>> Columbia University Medical Center
>>> 630 W. 168th St.
>>> New York, NY 10032
>>> (212)305-6901 (5-6901) (voice)
>>> friedman at cancercenter.columbia.edu
>>> http://cancercenter.columbia.edu/~friedman/
>>>
>>> "The last 250 pages of the last Harry Potter
>>> book took place in one day because alot
>>> happened in that day. All of Ulysses takes
>>> place in one day and nothing happened in that day."
>>> -Rose Friedman, age 11
>>>
>>>
>>
>>
>>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


