[BioC] QA of two-color array data

Robert Castelo robert.castelo at upf.edu
Wed Oct 28 12:13:57 CET 2009


Naomi,

i think you can dismiss my previous email below. i really wanted to try
the normexp method with mle estimates and couldn't wait until Gordon's
patch for using an RGList with the normexp-mle method shows up on the
bioconductor server. so i hacked the RG object: i extracted the matrices
of red and green intensities and applied the method directly to them,
which according to Gordon does the expected job, and then pasted the
result back into my RGList object.
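
A minimal sketch of the workaround described above, not the exact
commands that were run. It assumes limma's backgroundCorrect.matrix() is
applied to each channel with the same settings as in the original post
(normexp, mle, offset=50), and that the background matrices Rb and Gb are
passed along and then dropped. The small simulated RGList only stands in
for the real 'RG' object so that the snippet runs on its own.

```r
library(limma)

# Simulated stand-in for the real data (assumption): 500 probes x 2 arrays,
# foreground = background noise + exponentially distributed signal.
set.seed(1)
n <- 500
sim <- function() matrix(rnorm(2 * n, 100, 10) + rexp(2 * n, 1 / 500), n, 2)
bg  <- function() matrix(rnorm(2 * n, 60, 5), n, 2)
RG <- new("RGList", list(R = sim(), G = sim(), Rb = bg(), Gb = bg()))

# Background-correct each channel as a plain matrix instead of going
# through the RGList interface.
R <- backgroundCorrect.matrix(RG$R, RG$Rb, method = "normexp",
                              normexp.method = "mle", offset = 50)
G <- backgroundCorrect.matrix(RG$G, RG$Gb, method = "normexp",
                              normexp.method = "mle", offset = 50)

# Paste the corrected intensities back into a copy of the RGList. The
# background has already been accounted for, so drop it to keep later
# steps from subtracting it again.
RGneMLE <- RG
RGneMLE$R <- R
RGneMLE$G <- G
RGneMLE$Rb <- NULL
RGneMLE$Gb <- NULL
```

The resulting object can then go through normalizeWithinArrays() with
bc.method="none", as in the original commands quoted below.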

after the normalization steps, the result is that with the normexp
method and mle estimates the biases have been successfully removed:

http://functionalgenomics.upf.edu/QA/MA-plotsNoCtrlsNEmle1.png
http://functionalgenomics.upf.edu/QA/MA-plotsNoCtrlsNEmle2.png

Naomi, thanks again, and i should also thank Tobias Straub for raising
the issue with the implementation of the normexp method, James McDonald
for drawing Gordon's attention to it and, of course, Gordon Smyth for
clearing up the issue so quickly.

robert.

On Wed, 2009-10-28 at 10:31 +0100, Robert Castelo wrote:
> thanks Naomi, i guess this is embarrassingly obvious :-} i've made the
> plot without the Agilent control spots and the two clusters with low
> M-values have disappeared from these plots:
> 
> http://functionalgenomics.upf.edu/QA/MA-plotsNoCtrls1.png
> http://functionalgenomics.upf.edu/QA/MA-plotsNoCtrls2.png
> 
> and now the intensity-dependent biases for some of the arrays stand
> out even more clearly. i find them a bit weird in the sense that the
> bias does not affect the bulk of the probes with low intensities, only
> a subset of them. i've googled about this but found only success
> stories about removing such bias after background correction and
> normalization.
> 
> if i look at the MA-plots for the raw data (from the 'RG' object)
> excluding control spots:
> 
> http://functionalgenomics.upf.edu/QA/MA-plotsRawNoCtrls1.png
> http://functionalgenomics.upf.edu/QA/MA-plotsRawNoCtrls2.png
> 
> i see the bias affecting the bulk of probes with low intensities for
> those problematic cases, so i guess the problem might be that i'm not
> using appropriate background correction and/or normalization algorithms.
> 
> as shown in my previous email i'm currently using 'normexp' with
> 'mle' (although, if i correctly interpret a recent post from Gordon,
> the version i used in fact employs 'saddlepoint' estimates instead of
> 'mle'), loess within-array normalization and scale between-array
> normalization.
> 
> do you, or anybody on the list, have any hint on how i could preprocess
> these data in order to try to remove those artifacts?
> 
> thanks,
> 
> robert.
> 
> 
> On Tue, 2009-10-27 at 11:38 -0400, Naomi Altman wrote:
> > The weird spots are probably the Agilent quality control 
> > spots.  Remove them and redo the plot.
> > 
> > --Naomi
> > 
> > At 05:53 AM 10/27/2009, Robert Castelo wrote:
> > >dear list,
> > >
> > >i have very limited experience in the QA of microarray data and i'd like
> > >to know the opinion of people with more experience in this job on whether
> > >there are issues with the QA of the data i'm analyzing, and whether i could
> > >pre-process these data differently in order to try to correct for the
> > >possible QA problems.
> > >
> > >i'm re-analyzing a series of 12 two-color microarray experiments
> > >deposited in GEO (acc. GSE13943). these are custom 4x44K Agilent arrays
> > >with probes targeting exons and splice junctions in Drosophila
> > >melanogaster. the experiments correspond to RNAi knock-downs of 4
> > >RNA-binding proteins -hrp36, hrp38, hrp40 and hrp48- (red channel)
> > >against a non-specific RNAi control (green channel) in three independent
> > >replicates for each KO experiment.
> > >
> > >after reading the raw data files into an RGList object called 'RG' i've
> > >performed background correction, within- and between-array normalization
> > >as follows:
> > >
> > >RGneMLE <- backgroundCorrect(RG, method="normexp", normexp.method="mle",
> > >offset=50)
> > >
> > >MA <- normalizeWithinArrays(RGneMLE[RGneMLE$genes$ControlType!=-1,],
> > >                             method="loess", bc.method="none")
> > >
> > >MA <- normalizeBetweenArrays(MA, method="scale")
> > >
> > >i have produced the corresponding MA-plots of the latter pre-processed
> > >MA data object for each of the 12 arrays which i've put on the web so
> > >that you can take a look at them:
> > >
> > >http://functionalgenomics.upf.edu/QA/MA-plots1.png
> > >
> > >http://functionalgenomics.upf.edu/QA/MA-plots2.png
> > >
> > >when i look at these plots i see the following two unexpected features:
> > >
> > >-in the replicates of hrp36, replicate 1 of hrp38, replicate 1 of hrp40
> > >and replicate 2 of hrp48 there are some small intensity-dependent biases
> > >affecting the low average intensity values (A).
> > >
> > >-across all replicates i see two clusters of probes with low M values
> > >(i.e., higher green signal).
> > >
> > >if i look at the image plots (generated with 'imageplot3by2(RG)'):
> > >
> > >http://functionalgenomics.upf.edu/QA/image-Gb-1-6.png
> > >
> > >http://functionalgenomics.upf.edu/QA/image-Gb-7-12.png
> > >
> > >i see some lines crossing from the top to the bottom, but i don't know if
> > >this is related to the issues raised before.
> > >
> > >i've run the arrayQualityMetrics package on these data with the
> > >following command:
> > >
> > >arrayQualityMetrics(expressionset=RG, outdir="aqm", force=TRUE)
> > >
> > >and put the output here:
> > >
> > >http://functionalgenomics.upf.edu/QA/aqm/QMreport.html
> > >
> > >according to this report there are no outlier arrays and so i'm
> > >wondering whether in fact there are no QA problems and i'm simply
> > >not using the appropriate pre-processing algorithms for this kind of
> > >data.
> > >
> > >thanks!
> > >robert.
> > >
> > >_______________________________________________
> > >Bioconductor mailing list
> > >Bioconductor at stat.math.ethz.ch
> > >https://stat.ethz.ch/mailman/listinfo/bioconductor
> > >Search the archives: 
> > >http://news.gmane.org/gmane.science.biology.informatics.conductor
> > 
> > Naomi S. Altman                                814-865-3791 (voice)
> > Associate Professor
> > Dept. of Statistics                              814-863-7114 (fax)
> > Penn State University                         814-865-1348 (Statistics)
> > University Park, PA 16802-2111
> > 
> >
> 


