[BioC] limma and minimum fold change

Mark Cowley m.cowley at garvan.org.au
Sun Nov 14 23:51:07 CET 2010


Hi Eva,
Volcano plots are very useful for showing the relationship between logFC and pvalue.

# logFC vs uncorrected p-values
plot(sign$logFC, -log10(sign$p.value))
# logFC vs corrected p-values
plot(sign$logFC, -log10(sign$adj.p.val))

I suspect you'll see a distinct V shape, with large dispersion on either side of the V

cheers,
Mark
On 15/11/2010, at 4:43 AM, Wolfgang Huber wrote:

> 
> Hi Eva
> 
> in the ordinary t-test, the t-statistic is proportional to the ratio of fold change 'fc' and estimated within-group standard deviation 's'; and the p-value depends on that through a monotonous transformation, ie. setting a p-value cutoff is equivalent to setting a cutoff on |t|. s_gene is simply, for each gene separately, the empirical standard deviation.
> 
> With limma/eBayes, you replace s by a cleverly calibrated compromise s_limma between s_gene and an overall pooled s_tot (across all genes). With a small number of replicates, that compromise is close to s_tot; with many replicates, close to s_gene; and somewhere inbetween in intermediate cases. Now, if (eBayes decides that) you have few replicates, then s_gene ~ s_tot is always the same for all genes, and your p-value cutoff is directly equivalent to a cutoff on fc. This, it appears, is what you see.
> 
> Have a look, in your 'eb' object, at the slots 'df.prior' and 'df.residual'. The manual page of eBayes says: "s2.post is the weighted average of s2.prior and sigma^2 with weights proportional to df.prior and df.residual respectively."
> 
> 	Best wishes
> 	Wolfgang
> 
> 
> Il Nov/13/10 10:44 PM, Sean Davis ha scritto:
>> On Sat, Nov 13, 2010 at 1:56 PM, Eva Benito Garagorri<ebenito at umh.es>wrote:
>> 
>>> Dear list,
>>> 
>>> I have been working with some microarray data where I make simple
>>> comparisons between a control and a treatment group. I used vsn to normalize
>>> and limma to find differentially expressed genes. I used "topTable" with
>>> number=Inf and p.value=0.05 to get all significant genes for a given
>>> condition, regardless of their fold change. I seem to be getting significant
>>> genes with a fold change systematically bigger than 1.3 (linear scale), both
>>> up and downregulated. I was wondering whether this can happen and why or
>>> whether I am missing something or making a mistake in the analysis. I guess
>>> I expected that some genes would be significant even with a very small fold
>>> change. I tried the same analysis with the ALL dataset and I found that
>>> genes were significant with a fold change above ~1.1. Below is the code I
>>> used for the analysis of the ALL dataset, which is essentially the same I
>>> used for my own analysis. Thanks in advance!
>>> 
>>> 
>> Hi, Eva.  I think what you are saying is that you are seeing genes that
>> change by 30% and that you are concerned that you should see genes with a
>> smaller fold change being significant?  The fold change that is significant
>> will be dependent on a number of factors, but sample size and biologic
>> factors such as the amount of heterogeneity in the two conditions are strong
>> contributors.  I wouldn't say that there is anything wrong with your
>> results, at least without knowing more detail.
>> 
>> Sean
>> 
>> 
>> 
>>>  Eva
>>> 
>>> 
>>> library(ALL)
>>> data(ALL)
>>> library(limma)
>>> 
>>> 
>>> f = ALL$mol.biol
>>> 
>>> mat = model.matrix(~f, ALL)
>>> lm = lmFit(ALL, mat)
>>> eb = eBayes(lm)
>>> 
>>> sign = topTable(eb, coef=2,number=Inf,p.value=0.05)
>>> dim(sign)
>>> 
>>> sort(2^sign$logFC)
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> sessionInfo()
>>> R version 2.11.1 (2010-05-31)
>>> x86_64-apple-darwin9.8.0
>>> 
>>> locale:
>>> [1] es_ES.UTF-8/es_ES.UTF-8/C/C/es_ES.UTF-8/es_ES.UTF-8
>>> 
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>> 
>>> other attached packages:
>>> [1] limma_3.4.4   ALL_1.4.7     Biobase_2.8.0
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.11.1
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ----------
>>> Eva Benito Garagorri
>>> PhD program in Neurosciences
>>> Institute for Neurosciences in Alicante
>>> UMH-CSIC
>>> San Juan de Alicante
>>> 03550
>>> Spain
>>> ebenito at umh.es
>>> (34) 965 91 92 33
>>> 
>>> 
>>> 
>>> 
>>> 
>>>        [[alternative HTML version deleted]]
>>> 
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> 
>> 
>> 	[[alternative HTML version deleted]]
>> 
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list