[BioC] Should I skip the eBayes step when using Limma for Affymetrix miRNA v1 chip

Gordon K Smyth smyth at wehi.EDU.AU
Sun Aug 24 02:59:54 CEST 2014

Dear Scott,

Just for the record, here is how you can incorporate probe-set size into 
the eBayes step to see whether it is important.  I will simulate a little 
toy example where we know there should be a trend:

   library(limma)
   n <- 1:100 # number of probes per probe-set
   ID <- rep(1:100,n) # probe-set IDs, one entry per probe
   nprobes <- length(ID)
   x <- matrix(rnorm(nprobes*3),nprobes,3) # probe-level data for 3 arrays
   y <- avereps(x,ID=ID)

# y has 100 rows; row i is the average of n[i] probes, so its variance is 1/n[i]

   design <- matrix(1,3,1)
   fit <- lmFit(y,design)
   fit$Amean <- log(n) # use log probe-set size as the trend covariate
   fit <- eBayes(fit,trend=TRUE)

Because eBayes(trend=TRUE) uses fit$Amean as the covariate for the trend, 
limma will estimate a decreasing trend of variance vs n as well as doing 
empirical Bayes squeezing around the trend.  If you then run plotSA(fit), 
the x-axis will be labelled as average log-expression, but it is actually 
log(n).
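
One way to check whether the trend was picked up is to compare the fit with and without trend=TRUE. The following is only a sketch built on the same toy simulation; the seed and the names fit.trend, fit.const and trend.cor are arbitrary choices for illustration and not part of the example above.

```r
library(limma)

set.seed(1)                     # arbitrary seed, only for reproducibility
n  <- 1:100                     # number of probes per probe-set
ID <- rep(1:100, n)             # probe-set IDs, one entry per probe
x  <- matrix(rnorm(length(ID) * 3), length(ID), 3)
y  <- avereps(x, ID = ID)       # 100 rows, row i averages n[i] probes

design <- matrix(1, 3, 1)
fit <- lmFit(y, design)
fit$Amean <- log(n)             # trend covariate is log probe-set size

fit.trend <- eBayes(fit, trend = TRUE)   # prior variance varies with log(n)
fit.const <- eBayes(fit, trend = FALSE)  # one prior variance for all rows

# With trend=TRUE, s2.prior holds one fitted prior variance per probe-set.
# A negative correlation with log(n) indicates a decreasing trend.
trend.cor <- cor(log(fit.trend$s2.prior), log(n))
print(trend.cor)

# Prior degrees of freedom; larger values mean stronger squeezing.
print(c(trend = fit.trend$df.prior, const = fit.const$df.prior))
```

In this simulation the correlation should come out clearly negative. The prior degrees of freedom would normally be larger with the trend than without, because the trend absorbs the variation that is due to probe-set size. On real data, a flat trend and similar df.prior values would suggest that probe-set size is not important.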

Best wishes

On Fri, 22 Aug 2014, Gordon K Smyth wrote:

> Dear Scott,
>> Date: Thu, 21 Aug 2014 04:49:49 -0700 (PDT)
>> From: "Scott Robinson [guest]" <guest at bioconductor.org>
>> To: bioconductor at r-project.org, scott.robinson at glasgow.ac.uk
>> Subject: [BioC] Should I skip the eBayes step when using Limma for
>> 	Affymetrix miRNA v1 chip?
>> Dear List,
>> I am working with Affymetrix's miRNA V1 chip, which uses very different 
>> probe sets for different molecule types, e.g. 4 identical probes for one 
>> miR, or 11 different probes for a snoRNA.
>> I have read that the eBayes step assumes equal error variance between probe 
>> sets so it is not suitable for this kind of mixed set of probe set designs.
> Having variances from the same distribution is not the same as having the 
> same variance.
>> To further complicate matters I am thinking about generating a custom CDF 
>> where the miR probe sets would have varied number of probes.
>> http://pomelo2.bioinfo.cnio.es/help/pomelo2-help.html#toc10
>> Should I look at everything through Limma without the eBayes step (making 
>> it equivalent to a normal t-test?),
> That would throw the baby out with the bath water.
> I doubt that the error variance depends quite as directly on the number of 
> probes in a probe-set as you might think.  When we have analysed the miRNA 
> Affymetrix chip, we have found that it has major problems from the point of 
> view of normalization, while the issue that you raise is relatively minor.
> I could suggest ways to take into account the number of probes per probe-set 
> in the eBayes calculations, but I don't think this will be important.
> Best wishes
> Gordon
> PS.  If you have the choice, RNA-seq is cheaper and better.
>> or separate into several different analyses for different molecule types 
>> and only drop the eBayes step for the miRs (which will have varying sizes 
>> of probe sets)?
>> Many thanks,
>> Scott
>> -- output of sessionInfo():
>>> sessionInfo()
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> locale:
>> [1] LC_COLLATE=English_United Kingdom.1252
>> [2] LC_CTYPE=English_United Kingdom.1252
>> [3] LC_MONETARY=English_United Kingdom.1252
>> [5] LC_TIME=English_United Kingdom.1252
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> --
>> Sent via the guest posting facility at bioconductor.org.
