[BioC] global vs. separate in limma decideTests
Gordon K Smyth
smyth at wehi.EDU.AU
Thu May 8 02:27:24 CEST 2008
> Date: Tue, 6 May 2008 11:22:32 -0700
> From: "Donna Toleno" <toleno at usc.edu>
> Subject: [BioC] global vs. separate in limma decideTests
> To: "bioconductor at stat.math.ethz.ch" <bioconductor at stat.math.ethz.ch>
> Content-Type: text/plain
>
> Hello,
>
> I have a short question about decideTests. I've read a few postings on this
> topic and I thought that I understood the difference between "global" and
> "separate". My understanding is that "global" considers all the contrasts
> which means it is considering many more hypothesis tests than "separate". I
> ran some analysis using "global" and then, since my gene lists of interest
> were different for each contrast, I decided that "separate" may be the
> better choice. What puzzled me is that I found more
> differential expression using "global" than I did with "separate". I thought
> that more tests would always lead to higher adjusted p-values and fewer
> inferences of differential expression.

This is generally true for p-value adjustments like Bonferroni or Holm but
not for FDR methods like BH. With FDR, the proportion of DE genes can go
up or down as you add more tests. The reason for this is that FDR is
scalable: if you can keep the FDR below a proportion p in several separate
sets of tests, then it follows that you've also kept FDR below p in the
combined set of tests.

Strictly speaking, the proportion of DE can go up with Holm also as you
add more tests, but this doesn't happen so often.

The reason for this phenomenon is the step-up or step-down nature of these
adjustment methods, which takes the whole set of p-values into account
when adjusting each one.
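
A small sketch of this effect, using only base R's p.adjust(). The p-values
below are made up for illustration and are not taken from the data in the
question: p_a stands in for the contrast of interest and p_b for a second
contrast with strong signal. Adjusting p_a alone mimics "separate", and
adjusting the pooled vector mimics "global" (ignoring the lfc cutoff).

```r
# Hypothetical p-values for the contrast of interest: four moderately small
# values and six clearly null ones.
p_a <- c(0.01, 0.02, 0.03, 0.04, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95)

# Hypothetical p-values for a second contrast with very strong signal.
p_b <- rep(1e-6, 10)

# Index of the contrast-of-interest entries within the pooled vector.
idx_a <- seq_along(p_a)

# "separate"-style: BH adjustment applied to contrast A on its own.
n_sep <- sum(p.adjust(p_a, method = "BH") < 0.05)

# "global"-style: BH adjustment applied to the pooled p-values,
# then count significant results among the contrast A entries only.
n_glob <- sum(p.adjust(c(p_a, p_b), method = "BH")[idx_a] < 0.05)

# Same comparison with Bonferroni, where pooling can only raise the
# adjusted p-values of the original tests.
b_sep  <- sum(p.adjust(p_a, method = "bonferroni") < 0.05)
b_glob <- sum(p.adjust(c(p_a, p_b), method = "bonferroni")[idx_a] < 0.05)

cat("BH separate:", n_sep, " BH global:", n_glob, "\n")
cat("Bonferroni separate:", b_sep, " Bonferroni global:", b_glob, "\n")
```

With these made-up numbers, BH finds nothing in contrast A on its own but
three genes once the strong contrast is pooled in, because the many small
p-values from p_b push the step-up cutoff higher. Bonferroni finds nothing
either way.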
Best wishes
Gordon
> There was a note of caution from
> Gordon Smyth posted on the list about being careful not to include spurious
> contrasts. In my case, the other contrasts are not really spurious; they
> are just not the contrasts of interest for that particular subset. The
> subsets do overlap, probably by a large fraction, which is why I chose
> "global" at first.
>
> So here is some example code:
>
>> filtered_results_global_ref <-
>   decideTests(fit2_tissues, method="global", adjust.method="BH",
>               p.value=0.05, lfc=log2(1.2))[names(which(selected)), 1]
>> standard_DE_down_global <- which (filtered_results_global_ref== -1)
>> length(standard_DE_down_global)
> [1] 274
>
> repeating the above using "separate" gives me fewer significant
> down-regulated genes.
> [1] 244
>
> # This one is the same as using method "separate" because I really am only
> # considering one contrast.
> filtered_results_global_ref <-
>   decideTests(fit2_tissues[names(which(selected)), 1], method="global",
>               adjust.method="BH", p.value=0.05, lfc=log2(1.2))
>> standard_DE_down_global <- which (filtered_results_global_ref== -1)
>> length(standard_DE_down_global)
> [1] 244
>
> My questions are 1. Why are there fewer significant results for "separate"
> than for "global"?