[BioC] RMA-bimodality:

Mon Jun 5 19:28:43 CEST 2006

Wolfgang,

Thank you for your reply.

Just so that I am clear: the point is that the
bimodality is not an artifact of the convolution, but
simply reflects the fact that the number of modes of a
distribution is not conserved under monotone
transformations. This is why the paper points out
that the histograms of log2(PM/MM) stratified
by log2(PM) are bimodal, so bimodality is a more
general property of the probe-level data.
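
As a check on this reading, here is a minimal sketch in R that applies the
change-of-variables (chain rule) formula to the mock example quoted below,
instead of relying on kernel density estimates. It assumes that log(z - 20) is
exactly an equal mixture of N(0, 1) and N(3, 1), which is what the mock code
draws from. The names f_x, f_raw, f_corrected and count_modes are illustrative
only and do not come from any package.

```r
# Density of x = log(z - 20): equal mixture of N(0, 1) and N(3, 1),
# matching the mock intensities z = 20 + exp(x) (an assumption of this sketch).
f_x <- function(x) 0.5 * dnorm(x, 0, 1) + 0.5 * dnorm(x, 3, 1)

# Density of y = log2(z - 20) = x / log(2); the Jacobian dx/dy is log(2).
f_corrected <- function(y) f_x(y * log(2)) * log(2)

# Density of y = log2(z) = log2(20 + exp(x)):
# x = log(2^y - 20) and dx/dy = log(2) * 2^y / (2^y - 20).
f_raw <- function(y) {
  x <- log(2^y - 20)
  f_x(x) * log(2) * 2^y / (2^y - 20)
}

# Hypothetical helper: count strict interior local maxima of f on a grid.
count_modes <- function(f, grid) {
  v <- f(grid)
  i <- 2:(length(v) - 1)
  sum(v[i] > v[i - 1] & v[i] > v[i + 1])
}

# The grid for f_raw starts just above log2(20), where the density is defined.
modes_raw <- count_modes(f_raw, seq(4.322, 12, by = 0.001))
modes_corrected <- count_modes(f_corrected, seq(-8, 12, by = 0.001))
print(c(raw = modes_raw, corrected = modes_corrected))
```

Under these assumptions the same data should show one mode on the log2(z)
scale and two on the log2(z - 20) scale; the only difference between the two
densities is the Jacobian factor, which varies with y for log2(z) but is
constant for log2(z - 20).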

Please clarify if this is incorrect.

Thanks, 
Noel

--- Wolfgang Huber <huber at ebi.ac.uk> wrote:

> 
> Hi,
> 
> I am surprised that anybody is surprised about the
> different number of modes ("peaks"): the number of
> modes of a distribution is not conserved under
> monotone transformations (such as the background
> correction in RMA); this simply follows from the
> chain rule.
> 
> See below for a simple example with some "mock"
> microarray intensities z and the density of the
> log-transformed values before and after a (primitive)
> background correction.
> 
> Cheers
>  Wolfgang
> 
> 
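> set.seed(123)
> 
> # mock intensities: constant background of 20 plus a two-component signal
> n = 100000
> z = 20 + exp(c(rnorm(n), 3+rnorm(n)))
> 
> # left: log2 of the raw values; right: log2 after subtracting the background
> par(mfrow=c(1,2))
> plot(density(log2(z)))
> plot(density(log2(z-20)))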
> 
> 
> noel0925 at sbcglobal.net wrote:
> > In the paper "Exploration, Normalization and Summaries
> > of High Density Oligonucleotide Array Probe Level Data",
> > the following statement regarding the
> > bimodality of log2(PM) values and RMA background
> > corrected PM values can be found: "The same bimodal
> > effect is seen when we stratify by log2(PM), thus it
> > is not an artifact of conditioning on sums." (p4).
> > I am a little confused by this, as I thought that it was
> > indeed an artifact of the convolution!
> > 
> > Clearly, the background corrected intensity
> > values are given by E(S | O), the conditional
> > expectation of the signal given what we observe,
> > where the observed signal is the convolution of a
> > normally distributed background B with mean mu and
> > variance sigma^2 (B ~ N(mu, σ^2)) and an exponentially
> > distributed signal S with mean alpha (S ~ Exp(α)).
> > 
> > There have been several postings regarding this matter
> > in the Bioconductor archives and all seem to point to
> > this. Have I misunderstood?
> > 
> > In particular, there was the following post:
> > https://stat.ethz.ch/pipermail/bioconductor/2004-August/005908.html
> > (see below for the response from zwu at jhsph.edu):
> > 
> > The original question I got was about the bimodal
> > distribution of the gcrma result from probe
> > intensities with a unimodal distribution. My answer
> > was that the "change" was not necessarily surprising.
> > 
> > For example, when you have "true log signal" from a
> > bimodal distribution
> > logS=c(rnorm(1000,3,1),rnorm(1000,8,2))
> > # You will see this has two peaks
> > par(mfrow=c(2,2))
> > plot(density(logS))
> > # if the background, log(non-specific binding), comes from
> > logB=rnorm(2000,6,1)
> > # then when you plot the density of the convolution on the log scale,
> > plot(density(log(exp(logS)+exp(logB))))
> > # you see only one peak, and this would be "before gcrma".
> > 
> > This explanation made sense to me, but seems to
> > contradict what is stated in the paper.
> > 
> > Also, can someone explain the difference between RMA
> > background version 1 and version 2?
> > 
> > 
> > Best regards,
> > Noel
> > 
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> 
> -- 
>
------------------------------------------------------------------
> Wolfgang Huber  EBI/EMBL  Cambridge UK 
> http://www.ebi.ac.uk/huber
> 
>