[BioC] Multiple test question in microarray - FDR

Sat Dec 13 23:18:04 CET 2008

On Sat, Dec 13, 2008 at 4:01 PM, Wayne Xu <wxu at msi.umn.edu> wrote:
> Thanks, Sean,
> Your explanation makes sense to me. Are there any instructions for how to
> search this mailing list? You say there are MANY discussions of this topic
> there, and I would like to read them.

At the bottom of every original post, there are some links that will
get you there.  In case you don't have access to an original post:

http://news.gmane.org/gmane.science.biology.informatics.conductor

Hope that helps.

Sean

> Sean Davis wrote:
>>
>> On Sat, Dec 13, 2008 at 12:36 PM, Wayne Xu <wxu at msi.umn.edu> wrote:
>>
>>>
>>> Hello,
>>> I am not sure this is the right place to ask this question, but it is about
>>> microarray data analysis:
>>>
>>> In a two-group t test, the multiple-testing q values depend on the total
>>> number of genes in the test. If I filter the gene list first, for example
>>> by using only those genes that have 1.2-fold changes for the t test and
>>> the multiple-testing correction, the gene list is much smaller than the
>>> total gene list, and the q values are then much smaller.
>>>
>>> Do you think the above is a correct approach? Might people who do not do
>>> it that way consider that statistical power is lost? If so, how much power
>>> is lost, and how can the power be calculated in this case?
>>>
>>
>> No, you cannot filter based on fold change.  However, you can filter
>> based on variance or some other measure that does not depend on the
>> two groups being compared.  Anything that filters genes based on
>> "knowing" the two groups will lead to a biased test.  Remember that
>> filtering removes genes from consideration in further analysis.
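
As a rough illustration of a filter that does not depend on the two
groups, here is a minimal Python sketch. The function name
variance_filter, the keep_fraction parameter, and the toy data are all
invented for this example and are not from the thread or from any
Bioconductor package. The point is only that the group labels are never
passed to the filter, so it cannot "know" which sample is in which group.

```python
from statistics import variance


def variance_filter(expression, keep_fraction=0.5):
    """Return the names of the most variable genes.

    expression: dict mapping gene name -> list of values across ALL samples.
    Group labels are deliberately not an argument of this function.
    """
    # Rank genes by sample variance computed over every sample together.
    ranked = sorted(expression, key=lambda g: variance(expression[g]), reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy data, invented for illustration: 4 genes measured in 6 samples.
toy = {
    "geneA": [5.0, 5.1, 4.9, 5.0, 5.1, 5.0],
    "geneB": [2.0, 8.0, 3.0, 9.0, 2.5, 8.5],
    "geneC": [7.0, 7.2, 6.8, 7.1, 6.9, 7.0],
    "geneD": [1.0, 4.0, 1.5, 4.5, 1.2, 4.2],
}

kept = variance_filter(toy, keep_fraction=0.5)
print(kept)
```

A fold-change filter, by contrast, would need the group labels to compute
the group means, which is exactly the dependence the reply warns against.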
>>
>> For further details, there are MANY discussions of this topic in the
>> mailing list.
>>
>>
>>>
>>> When people report multiple-testing q values, they usually do not mention
>>> how many genes were used in the multiple test. You can get different q
>>> values in the same dataset, even with the same method (e.g. a Benjamini
>>> or Holm adjustment). How can it make sense for the same genes to have
>>> different q values?
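
The dependence on the number of tests can be seen in a small Python
sketch. It assumes the Benjamini-Hochberg step-up adjustment, which the
thread does not confirm is the method in question, and the raw p-values
are invented. It shows the arithmetic only; it does not make subsetting
by fold change or by p-value a valid filter.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values for 10 hypothetical genes.
raw = [0.001, 0.01, 0.02, 0.03, 0.04, 0.5, 0.6, 0.7, 0.8, 0.9]

all_genes = bh_adjust(raw)     # adjustment over all 10 tests
subset = bh_adjust(raw[:5])    # same first 5 raw p-values, only 5 tests

print(all_genes[0], subset[0])
```

The gene with raw p-value 0.001 is adjusted to about 0.01 when 10 genes
are tested and to about 0.005 when only 5 are, because each p-value is
scaled by the number of tests divided by its rank.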
>>>
>>
>> A good manuscript should describe in detail the preprocessing and
>> filtering steps, the statistical tests used, and the methods for
>> correcting for multiple testing.  You are correct that many papers do
>> not do so.
>>
>> As for different q-values in the same dataset using different methods,
>> it is important to note that one should not do an analysis, get a
>> result, and then, based on that result, go back and redo the analysis
>> with different parameters to get a "better" result.  It is very
>> important that each step of an analysis (preprocessing, filtering,
>> testing, multiple-testing correction) be justifiable independent of
>> the other steps in order for the results to be interpretable.
>>
>> Sean
>>
>
>
>