[BioC] Testing biased microarray data

Simon Anders anders at embl.de
Mon Mar 28 17:49:13 CEST 2011


Hi January

On 03/28/2011 05:02 PM, January Weiner wrote:
> the following problem: samples are either RNA, or RNA with selective
> depletion of some forms of RNA. In short, the relative abundance in
> the second group of samples should always be equal to or smaller than
> that in the control, but never higher. The difference in abundance
> might concern a substantial fraction of mRNAs (10-50%).
>
> Naturally, when the samples are normalised, since the total transcript
> abundance in the experimental group is significantly lower, the
> relative abundance of transcripts with no change will be higher in the
> experimental group, and artifacts will occur: we will observe genes
> that are apparently up-regulated, although in reality their levels
> remain stable.

We faced the same problem a while ago in a project comparing mRNA from 
fertilized vs unfertilized Drosophila eggs. In Drosophila eggs, an mRNA 
degradation machinery is activated when the egg is laid, and many
maternally deposited transcripts get degraded within a couple of hours. 
We had three time points, and in the unfertilized eggs, the transcript 
levels could only be lower but not higher in the later compared to the 
earlier time points, similar to your setting.

We solved the issue by first using VSN (with an increased trimming 
quantile), followed by LOESS and then RMA, and this worked very well.

Have a look at this image:

http://www.embl.de/~anders/misc_pub/FlyEggs_mod_vs_loess.png

Each panel is an MA plot, comparing the indicated array with an average 
over all arrays. The two lines are the LOESS fit lines (with two 
slightly different settings).
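
To make the construction of such a plot concrete, here is a minimal 
sketch in Python. It is not the code from our analysis: the function 
name ma_values and the toy numbers are made up for illustration, and I 
assume the usual convention that M is the log2 difference between the 
array and the reference and A is the mean of the two.

```python
from statistics import mean

def ma_values(array_log2, all_arrays_log2):
    """Compare one array with the per-gene average over all arrays.

    Returns two lists: A (mean log2 intensity of array and reference)
    and M (log2 difference array minus reference).
    """
    n_genes = len(array_log2)
    # Reference: per-gene average over all arrays (hypothetical layout:
    # one list of log2 intensities per array, genes in the same order).
    reference = [mean(arr[g] for arr in all_arrays_log2) for g in range(n_genes)]
    a = [(x + r) / 2 for x, r in zip(array_log2, reference)]
    m = [x - r for x, r in zip(array_log2, reference)]
    return a, m

# Toy data: 3 arrays x 4 genes, log2 scale.
arrays = [
    [10.0, 8.0, 6.0, 12.0],
    [10.0, 8.0, 6.0, 12.0],
    [10.0, 8.0, 3.0, 12.0],  # third gene has decayed on this array
]
a, m = ma_values(arrays[2], arrays)
```

In this toy example the decayed gene ends up both below M = 0 and 
further to the left than it would be otherwise, because its low signal 
pulls down its A value as well.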

Look, for example, at the four arrays for the late unfertilized time 
point ("unf.3"): The triangle towards the bottom left corner is formed 
by the decayed genes. They are lower than average (i.e., below y=0) and, 
because their signal is largely gone, their average intensity is low as 
well, which places them to the left; hence the triangle. The LOESS line 
clearly follows the bulk of non-decayed genes and is not deterred by the 
triangle. Other normalization techniques such as RMA only (without 
preceding LOESS) or quantile normalization did not do the job.
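
The following Python sketch illustrates why a robust, intensity-dependent 
fit can ignore the triangle while a global average cannot. Please read 
it as a toy under stated assumptions: it uses a binned median as a crude 
stand-in for LOESS (it is not LOESS and not what we ran), the data are 
simulated, and the function name binned_median_trend, the 20% decay 
fraction and the shift sizes are all made up.

```python
import random
from statistics import mean, median

def binned_median_trend(a, m, n_bins=5):
    """Median of M within equal-count bins of A: a crude robust trend line."""
    order = sorted(range(len(a)), key=lambda i: a[i])
    size = len(order) // n_bins
    trend = []
    for b in range(n_bins):
        # The last bin takes any leftover genes.
        idx = order[b * size:(b + 1) * size] if b < n_bins - 1 else order[b * size:]
        trend.append((median(a[i] for i in idx), median(m[i] for i in idx)))
    return trend

# Simulated MA values (assumption): the bulk of genes is unchanged
# (M scatters around 0), and every fifth gene is "decayed", i.e. shifted
# down in M and to the left in A.
rng = random.Random(0)
a_vals, m_vals = [], []
for g in range(1000):
    a_g = rng.uniform(6.0, 14.0)
    m_g = rng.gauss(0.0, 0.2)
    if g % 5 == 0:
        a_g -= 1.5
        m_g -= 3.0
    a_vals.append(a_g)
    m_vals.append(m_g)

trend = binned_median_trend(a_vals, m_vals)  # list of (A, M) points
global_mean = mean(m_vals)                   # pulled down by the decayed genes

for a_mid, m_med in trend:
    print(f"A ~ {a_mid:5.2f}   median M = {m_med:+.2f}")
print(f"global mean of M = {global_mean:+.2f}")
```

The global mean is dragged well below zero by the decayed genes, so 
centering on it would make the unchanged genes look up-regulated, while 
the binned medians stay close to the bulk. This only holds as long as 
the decayed genes are a minority within each intensity range.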

I can send you a code example if you need it. For further details, 
please see our paper and especially page 4 of the supplement:

Thomsen S, Anders S, Chandra Janga S, Huber W, Alonso CR. Genome-wide 
analysis of mRNA decay patterns during early Drosophila development.
Genome Biology, 11 (2010) R93.
http://genomebiology.com/2010/11/9/R93


   Simon



+---
| Dr. Simon Anders, Dipl.-Phys.
| European Molecular Biology Laboratory (EMBL), Heidelberg
| office phone +49-6221-387-8632
| preferred (permanent) e-mail: sanders at fs.tum.de


