[BioC] global vs. separate in limma decideTests

Thu May 8 02:27:24 CEST 2008

> Date: Tue, 6 May 2008 11:22:32 -0700
> From: "Donna Toleno" <toleno at usc.edu>
> Subject: [BioC] global vs. separate in limma decideTests
> To: "bioconductor at stat.math.ethz.ch" <bioconductor at stat.math.ethz.ch>
>
> Hello,
>
> I have a short question about decideTests. I've read a few postings on this
> topic and I thought that I understood the difference between "global" and
> "separate". My understanding is that "global" considers all the contrasts
> which means it is considering many more hypothesis tests than "separate". I
> ran some analyses using "global", and then decided that, since my gene
> lists of interest were different for each contrast, "separate" might be
> the better choice. What puzzled me is that I found more
> differential expression using "global" than I did with "separate". I thought
> that more tests would always lead to higher adjusted p-values and fewer
> inferences of differential expression.

This is generally true for p-value adjustments like Bonferroni or Holm but
not for FDR methods like BH.  With FDR, the proportion of genes called DE
can go up or down as you add more tests.  The reason for this is that FDR
is scalable: if you can keep the FDR below a proportion p in each of
several separate sets of tests, then it follows that you've also kept the
FDR below p in the combined set of tests.
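
The scalability argument can be sketched as follows. This is a simplified version stated in terms of realized false discovery proportions rather than expectations, and the symbols V_k (false discoveries in set k) and R_k (total discoveries in set k, assumed positive) are introduced here only for illustration:

```latex
\frac{V_k}{R_k} \le p \ \text{ for every set } k
\;\Longrightarrow\;
V_k \le p\,R_k \ \text{ for every } k
\;\Longrightarrow\;
\sum_k V_k \le p \sum_k R_k
\;\Longrightarrow\;
\frac{\sum_k V_k}{\sum_k R_k} \le p .
```

So a pooled analysis does not need to be stricter than the separate analyses in order to keep the same false discovery proportion.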

Strictly speaking, the proportion of DE can go up with Holm also as you 
add more tests, but this doesn't happen so often.

The reason for this phenomenon is the step-up or step-down nature of these
adjustment methods, which take the whole set of p-values into account
when adjusting each one.
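
A minimal sketch of the effect, using made-up p-values (not the data from the question) and only base R's p.adjust(). The vectors pA and pB are hypothetical stand-ins for two contrasts, one with many very small p-values and one with only a few moderately small ones. Adjusting pB on its own mimics "separate"; adjusting the pooled vector mimics "global".

```r
# Hypothetical p-values for two contrasts, 100 genes each.
# Contrast A: strong signal (50 very small p-values).
# Contrast B: weak signal (5 moderately small p-values).
pA <- c(rep(1e-4, 50), seq(0.5, 1, length.out = 50))
pB <- c(rep(0.01, 5),  seq(0.5, 1, length.out = 95))

# "separate"-style: BH applied to contrast B alone.
sepB <- sum(p.adjust(pB, method = "BH") < 0.05)

# "global"-style: BH applied to the pooled p-values,
# then count the rejections that belong to contrast B.
adj_all <- p.adjust(c(pA, pB), method = "BH")
globB   <- sum(adj_all[101:200] < 0.05)

print(c(separate = sepB, global = globB))
# separate   global
#        0        5
```

Alone, the five p-values of 0.01 in contrast B get a BH-adjusted value of 0.01 * 100 / 5 = 0.2. In the pooled set they rank 51 to 55 out of 200, so the adjusted value becomes 0.01 * 200 / 55, about 0.036, which falls below 0.05.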

Best wishes
Gordon

> There was a note of caution from
> Gordon Smyth posted on the list about being careful not to include spurious
> contrasts. In my case, the other contrasts are not really spurious; they
> are just not the contrasts of interest for that particular subset. The
> subsets do overlap, probably by a large fraction, which is why I chose
> "global" at first.
>
> So here is some example code:
>
>> filtered_results_global_ref <-
>   decideTests(fit2_tissues, method="global", adjust.method="BH",
>               p.value=0.05, lfc=log2(1.2))[names(which(selected)), 1]
>> standard_DE_down_global <- which (filtered_results_global_ref== -1)
>> length(standard_DE_down_global)
> [1] 274
>
> Repeating the above using "separate" gives me fewer significant
> down-regulated genes:
> [1] 244
>
> filtered_results_global_ref <-
>   decideTests(fit2_tissues[names(which(selected)), 1], method="global",
>               adjust.method="BH", p.value=0.05, lfc=log2(1.2))
> # This one is the same as using method "separate" because I really am
> # only considering one contrast.
>> standard_DE_down_global <- which (filtered_results_global_ref== -1)
>> length(standard_DE_down_global)
> [1] 244
>
> My questions are 1. Why are there fewer significant results for "separate"
> than for "global"?