[BioC] Regarding quantile normalization.

Mon Jan 3 17:30:55 CET 2011

Veerendra GP <gpveerendra09 at ...> writes:

> 
> Hi Sean,
> I am sorry, the code I mentioned in the last mail was not running code.
> Here I have included the complete running code.
> I have done the quantile normalization before multiple testing.
> 
> library (limma);
> targets <- readTargets("target.txt")
> #print (targets);
> RG <- read.maimages(targets$FileName, source="agilent",
>     columns=list(R="rProcessedSignal",G="gProcessedSignal"), path="data_files");
> #to remove control spots
> status <- rep("gene", nrow(RG$genes));
> status[grep("UHNcntrl*", RG$genes[,"ProbeName"])] <- "cntrl";
> status[grep("UHNblank*", RG$genes[,"ProbeName"])] <- "cntrl";
> RGnc <- RG[status!="cntrl",];
> 
> MA.q <- normalizeBetweenArrays(RGnc, method="quantile")
> #To visualize data after quantile normalization
> pdf("MA.q_Density_Plot_after_quantile.pdf");
> plotDensities(MA.q);
> dev.off();

<snip>

> I agree that the boxplots should be identical after the quantile
> normalization, but the boxplot which I sent you in the last mail was
> obtained after the quantile normalization as per the above code.
> Here I am also attaching the density plots for the same set of data.

Hi, it is not clear to me which 'boxes' you have plotted and expect to be the
same size. According to your code, you are normalizing the red and green
intensities so that they all have the same distribution. So if you make
boxplots of the normalized red and green intensities (separately), then those
boxplots should all be identical.

But if you want the boxplots of the _log ratios_ to be identical, then you have
not done the normalization appropriately. Separate quantile normalization of
the red and green channel intensities does not guarantee that the within slide
log ratios will have the same distribution across slides. In fact, it would be
rather surprising to me if they did.
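
To illustrate the point with made-up numbers, here is a minimal sketch in base
R. The data are simulated, and the helper quantile_normalize() is hypothetical
(a plain rank-based version, not limma's implementation). All red and green
channels are forced to one common distribution, yet the spread of the log
ratios still differs from slide to slide.

```r
# Hypothetical helper: give every column of x the same empirical distribution
quantile_normalize <- function(x) {
  target <- rowMeans(apply(x, 2, sort))            # mean of the sorted columns
  ranks  <- apply(x, 2, rank, ties.method = "first")
  out    <- apply(ranks, 2, function(r) target[r]) # map ranks back to target
  dimnames(out) <- dimnames(x)
  out
}

set.seed(1)
n      <- 1000   # probes (simulated)
arrays <- 3      # slides (simulated)

# Simulated log2 intensities; the true log-ratio spread differs per slide
base   <- matrix(rnorm(n * arrays, mean = 8, sd = 1.5), n, arrays)
M_true <- sapply(c(0.2, 0.6, 1.2), function(s) rnorm(n, 0, s))
logR   <- base + M_true / 2
logG   <- base - M_true / 2

# Normalize all single channels (3 red + 3 green) to a common distribution
norm <- quantile_normalize(cbind(logR, logG))
R_n  <- norm[, 1:arrays]
G_n  <- norm[, arrays + (1:arrays)]

# Within slide log ratios after the single-channel normalization
M <- R_n - G_n

print(apply(norm, 2, IQR))  # one common value for all six channels
print(apply(M, 2, IQR))     # still differs between slides
```

With these simulated data, boxplot(norm) gives identical boxes for the six
channels, while boxplot(M) gives boxes of clearly different sizes.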

You also have not made your aim very clear. As far as I can tell, you have
only said that you 'need to do an interarray normalization'. There can be
different motivations behind this, and the motivation is what should guide
your choice of normalization. You might get an acceptable normalization from a
within slide normalization of the log ratios followed by a between slide scale
normalization (or not, again depending on your experiment, which you have not
described at all). You can also look at the other method options for
normalizeBetweenArrays to see whether one of them suits your aim.
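
As a rough sketch of the within slide plus between slide scale route, assuming
limma is installed: the RGList below is simulated purely so that the example
runs on its own, and with your data you would pass RGnc instead. The choices
method="loess" and bc.method="none" are my assumptions, not a recommendation
for your experiment.

```r
library(limma)

set.seed(2)
n      <- 1000   # probes (simulated)
arrays <- 3      # slides (simulated)
slides <- paste0("slide", 1:arrays)

# Simulated log2 average intensities and log ratios with unequal spread
A <- matrix(rnorm(n * arrays, mean = 8, sd = 1.5), n, arrays,
            dimnames = list(NULL, slides))
M <- sapply(c(0.2, 0.6, 1.2), function(s) rnorm(n, 0, s))

# Stand-in for the RGnc object from the code above
RG <- new("RGList", list(R = 2^(A + M / 2), G = 2^(A - M / 2)))

# Step 1: within slide loess normalization of the log ratios
# (no background subtraction here; that is an assumption for this sketch)
MA.w <- normalizeWithinArrays(RG, method = "loess", bc.method = "none")

# Step 2: between slide scale normalization of the log ratios
MA.s <- normalizeBetweenArrays(MA.w, method = "scale")

# Scale normalization equalizes the median absolute log ratio per slide
spread <- apply(abs(MA.s$M), 2, median)
print(spread)
```

Note that this makes the log ratio boxes comparable in spread across slides;
it does not force the complete distributions to be identical.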

Best regards,

Darlene

-- 
Darlene Goldstein
École Polytechnique Fédérale de Lausanne (EPFL)
Institut de mathématiques
Bâtiment MA, Station 8        Tel: +41 21 693 0528
CH-1015 Lausanne              Fax: +41 21 693 4303
SWITZERLAND