[BioC] Limma: four suggestions related to weights and background correction

Fri Jan 23 17:05:10 MET 2004

Dear Gordon and other bioconductor-users,

We have encountered a few recurrent problem when using limma. If I may, I'd 
like to make four suggestions for changes which might benefit other people:

1. When "backgroundCorrect(temp1, method = "none")" is used columns for Rb and 
Gb are eliminated from the RGList object. We have found this problematic, 
because, even if one does not want to do background subtraction, one might 
still want to look at image plots of background, just to check nothing funny 
has happened with the background. The problem is that the subsequent step of 
using "normalizeWithinArrays" calls MA.RG, which itself uses the presence or 
absence of the Rb and Gb to do (or not) the subtraction. Then, if we still 
want to plot background and do the normalization, we either have to make sure 
we are done with all background business before we do the normalization 
(since after that all background info will be lost) or we need to keep the 
normalized and the unnormalized objects around. None of these seem ideal. We 
wonder if it would be possible to pass the type of subtraction to 
"normalizeWithinArrays" and to "MA.RG" as a parameter, so that the original 
object preservers all the information, but still we can do no background 
subtraction. (We are hacking the code here, for our internal use).

2. We have found it slightly confussing that the M value of a point with 
weight 0 for the normalization is not an NA. Sure, the points with weight = 0 
do not get used to fit the loess curve, but they still have a residual. But, 
so far, with the wet lab people we have worked, it has always been the case 
that if a point was deemed unsuitable for the normalization it was also 
unsuitable for further analyses (though, occasionally, points suitable for 
fitting the normalization curve have been deemed unsuitable for further 
analyses). 

Are we missing something obvious here?

3. Occasionally, a complet print-tip group might be missing which will make 
"normalizeWithinArrays" to fail. We have added the following to that 
function, in the print-tip loess case, right before the line "spots <- spots 
+ nspots" (and substituting the direct call to loessFit):

******************************************
num.fill <- length(object$M[spots, j]) 
object$M[spots, j] <- rep(NA, num.fill) 
no.objects <-
    try(object$M[spots, j]
        <- loessFit(y, x, w, span = span, 
                    iterations = iterations)$residuals) 
if(class(no.objects) == "try-error")  
    print(paste("WARNING: in print-tip loess normalization", 
                "no samples in array", j, "grid row", gridr,
                "grid column", gridc))
********************************************

4. When all weights are either 0 or 1, there are slight differences in the 
results of the normalization (of points with weight = 1) depending on whether 
or not the cases with weight of 0 are passed to subsequent calls. Sure, these 
differences are irrelevant, but it can be disconcerting. Wouldn't it be 
appropriate to check for the binary situation of weights either 0 or 1, and 
if so, exclude points with weight 0, which would allow us to call directly 
".C("lowess"", instead of ".vsimpleLoess" (inside "loessFit")?

***************
The following is an example.
(The data sets are at:
http://bioinfo.cnio.es/~rdiaz/00G66.txt)

temp01 <- read.maimages(files = "00G66.txt",
                       source = "genepix",
                       wt.fun=wtflags(0))

temp01.b <- temp01
set.to.na <- which(temp01.b$weights == 0)
temp01.b$R[set.to.na] <- NA
temp01.b$G[set.to.na] <- NA

temp01.c <- temp01.b
temp01.c$weights <- NULL

n1c <- normalizeWithinArrays(temp01.c, layout = temp.layout,
                             iterations = 5,
                             method = "loess")
n1b <- normalizeWithinArrays(temp01.b, layout = temp.layout,
                             iterations = 5,
                             method = "loess")
n1 <- normalizeWithinArrays(temp01, layout = temp.layout,
                             iterations = 5,
                            method = "loess")

pairs(cbind(n1$M, n1b$M, n1c$M))
summary(n1$M - n1b$M)

******************

Best,

R.

-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://bioinfo.cnio.es/~rdiaz
PGP KeyID: 0xE89B3462
(http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)