[BioC] Defining Weights in marrayNorm.

Tue Aug 19 16:15:49 MEST 2003

Hi,

For multiple arrays as far as I can see there is currently no way that

you can make a call to maNormMain that will allow maLoess to utilise 
weights from EACH column of your maW matrix. 
maNormMain uses a controlling function maNormLoess specified as a LIST
of 
calls in the arguement:

 f.loc=list(maNormLoess(x="maA",y="maM", z="maPrintTip", w=Flagweights,
...)

maNormLoess calls maLoess. Inside the code for maLoess the actual loess
fit
has an arguement weights=args$w, where args$w will be the vector of
Flagweights.

 fit <- loess(y ~ x, weights = args$w, subset = args$subset, 
            span = args$span, na.action = args$na.action, degree =
args$degree, 
            family = args$family, control = args$control)

What was probably happening in your call

data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA",
y="maM", z="maPrintTip", w=data at maW)))

was that only the first column of data at maW was being used as weights
for the 
the normalization vectors x="maA data for each array", 
and y="maM data for each array" in the loess smoother.

e.g.
data(cars)
cars.lo <- loess(dist ~ speed, cars)
plot(cars.lo$x, cars.lo$y)
lines(cars.lo$x,cars.lo$fit)

set.seed(1)
weights <- matrix(rbinom(100,1,0.4),nc=2)
cars.lo <- loess(dist ~ speed, cars, weights=weights)

lines(cars.lo$x,cars.lo$fit, col="red")

cars.lo <- loess(dist ~ speed, cars, weights=weights[1:50,1])
lines(cars.lo$x,cars.lo$fit, lty=8, col="green")

marcus

>>> "michael watson (IAH-C)" <michael.watson at bbsrc.ac.uk> 14/08/2003
2:01:44 AM >>>
Hi

In my continuing search to get this working, I have made progress :-D 
But I think I found a bug/feature...

OK here is what I did. 

I took a GenePix file.  I made a copy of it.  I added column
"SpotWeight" to both.  In one of the files I set the weights all to 1. 
In the other, I set all of the weights to be between 0 and 0.5 (random
numbers).  I just wanted to see if I could get it working.

So:

> data = read.GenePix(fnames = files, name.Gf = "F532 Median", name.Gb
= "B532 Median", name.Rf = "F635 Median", name.Rb = "B635 Median",
name.W = "SpotWeight", layout=layout)
> maW(data)

produces a nice lovely vector of my weights, so far so good.  By
chance, the first column was the one with all 1's - I think this is
significant.

> data.norm = maNorm(data, norm = "printTipLoess")

This works great and just produces normalised data as if maW didn't
exist - we expect this from the code, maNorm() function does not use
weights.

Now:

> data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA",
y="maM", z="maPrintTip", w=data at maW)))

is my big hope.  And it doesn't throw any errors :-D.  However, it does
just produce M values as if maW doesn't exist.  I am about to throw in
the towel when I think I should try something.  So I try:

> data.weight.norm = maNormMain(data[,1], f.loc =
list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data[,1]@maW)))

This again turns up the now familiar M values, unaffected by maW.  But
of course, in my first data set maW is all set to 1, so of course thats
what it SHOULD produce.  So i try:

> data.weight.norm = maNormMain(data[,2], f.loc =
list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data[,2]@maW)))

and guess what?  It works!  Hurrah!  My M values have been affected by
maW, they are different to normal and I can only assume maNormMain is
calculating weighted normalised M values according to maW.

But wait - isn't this a little incorrect?  The marrayRaw class allows
me to have different weights for different spots for all of my arrays. 
So why when I normalise using maNormMain() do I have to do it on an
array-by-array basis?  Surely:

> data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA",
y="maM", z="maPrintTip", w=data at maW)))

should work in that when it is normalising the nth marrayRaw data set
in "data", it should use the nth set of weights in data at maW...?  Instead
what it appears to have done is take the first column of data at maW, and
by chance in my case that was all 1's so I noticed.  If those hadn't
been 1's but had been legitimate weights for the first array, I don't
think I would have noticed.... and had all of my arrays normalised
according to weights for the first array.... :-(

Anyway, I believe I have cracked it now in that I can weight normalise
all ninety of my arrays.  The fact that i have to make 90 calls to
maNormMain and prodcue 90 normalised data sets is a nuisance rather than
anything else, though I do believe what i have said above makes sense, I
hope someone agrees :-)  In most other respects the marray* classes, and
bioconductor in general, are fantastic, so I hope I don't appear
unappreciative ;-)

Thanks
Mick

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch 
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

______________________________________________________
The contents of this e-mail are privileged and/or confidential to the
named recipient and are not to be used by any other person and/or
organisation. If you have received this e-mail in error, please notify 
the sender and delete all material pertaining to this e-mail.