[BioC] Defining Weights in marrayNorm.

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Wed Aug 13 16:01:44 MEST 2003


Hi

In my continuing search to get this working, I have made progress :-D  But I think I found a bug/feature...

OK here is what I did. 

I took a GenePix file.  I made a copy of it.  I added column "SpotWeight" to both.  In one of the files I set the weights all to 1.  In the other, I set all of the weights to be between 0 and 0.5 (random numbers).  I just wanted to see if I could get it working.

So:

> data = read.GenePix(fnames = files, name.Gf = "F532 Median", name.Gb = "B532 Median", name.Rf = "F635 Median", name.Rb = "B635 Median", name.W = "SpotWeight", layout=layout)
> maW(data)

produces a nice lovely vector of my weights, so far so good.  By chance, the first column was the one with all 1's - I think this is significant.

> data.norm = maNorm(data, norm = "printTipLoess")

This works great and just produces normalised data as if maW didn't exist - we expect this from the code, since the maNorm() function does not use weights.

Now:

> data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data@maW)))

is my big hope.  And it doesn't throw any errors :-D.  However, it does just produce M values as if maW doesn't exist.  I am about to throw in the towel when I think I should try something.  So I try:

> data.weight.norm = maNormMain(data[,1], f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data[,1]@maW)))

This again turns up the now familiar M values, unaffected by maW.  But of course, in my first data set maW is all set to 1, so of course that's what it SHOULD produce.  So I try:

> data.weight.norm = maNormMain(data[,2], f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data[,2]@maW)))

and guess what?  It works!  Hurrah!  My M values have been affected by maW, they are different to normal and I can only assume maNormMain is calculating weighted normalised M values according to maW.

But wait - isn't this a little incorrect?  The marrayRaw class allows me to have different weights for different spots for all of my arrays.  So why when I normalise using maNormMain() do I have to do it on an array-by-array basis?  Surely:

> data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data@maW)))

should work, in that when it is normalising the nth array in "data", it should use the nth column of weights in data@maW...?  Instead, what it appears to have done is take the first column of data@maW for every array; by chance, in my case that column was all 1's, so I noticed.  If those hadn't been 1's but had been legitimate weights for the first array, I don't think I would have noticed... and I would have had all of my arrays normalised according to the weights for the first array... :-(
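
To make the concern concrete, here is a toy sketch in base R. It does not use marray at all: it assumes a plain weighted loess fit of M on A (stats::loess) as a stand-in for the normalisation step, and the simulated data and the helper name normalise_column are made up for illustration. It compares "array j uses column j of the weights" with "every array uses column 1 of the weights".

```r
# Toy illustration only: base R loess, not the marray implementation.
set.seed(1)
n_spots  <- 200
n_arrays <- 2

# Simulated average intensities (A) and log-ratios (M) with an
# intensity-dependent trend, one column per array.
A <- matrix(runif(n_spots * n_arrays, 6, 14), n_spots, n_arrays)
M <- 0.3 * (A - 10) +
  matrix(rnorm(n_spots * n_arrays, sd = 0.3), n_spots, n_arrays)

# Column 1: all weights are 1.  Column 2: random weights between 0 and 0.5.
W <- cbind(rep(1, n_spots), runif(n_spots, 0, 0.5))

# Hypothetical helper: subtract a weighted loess fit of m on a.
normalise_column <- function(m, a, w) {
  fit <- loess(m ~ a, weights = w, span = 0.4, degree = 1)
  m - fitted(fit)
}

# Expected behaviour: array j is normalised with column j of W.
per_array <- sapply(seq_len(n_arrays), function(j) {
  normalise_column(M[, j], A[, j], W[, j])
})

# Suspected behaviour: every array is normalised with column 1 of W.
first_col <- sapply(seq_len(n_arrays), function(j) {
  normalise_column(M[, j], A[, j], W[, 1])
})

# Largest difference between the two schemes, per array.
apply(abs(per_array - first_col), 2, max)
```

The two schemes agree for the first array and differ for the second. Because column 1 is all 1's, the "first column" result is also the same as an unweighted fit, which matches what was observed above.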

Anyway, I believe I have cracked it now, in that I can weight-normalise all ninety of my arrays.  The fact that I have to make 90 calls to maNormMain and produce 90 normalised data sets is a nuisance rather than anything else, though I do believe what I have said above makes sense; I hope someone agrees :-)  In most other respects the marray* classes, and Bioconductor in general, are fantastic, so I hope I don't appear unappreciative ;-)
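
The array-by-array workaround could be wrapped in a loop rather than typed out ninety times. A minimal sketch, assuming the marray package is installed, that maW(data) has one column per array, and that data[, i] selects a single array as in the calls above; normalise_each_array is a hypothetical helper name, not part of marray.

```r
# Hypothetical helper (not part of marray): normalise each array on its
# own, passing that array's weights to maNormLoess, as in the
# single-array calls above.
normalise_each_array <- function(data) {
  n_arrays <- ncol(marray::maW(data))  # assumes one weight column per array
  lapply(seq_len(n_arrays), function(i) {
    one_array <- data[, i]
    marray::maNormMain(
      one_array,
      f.loc = list(marray::maNormLoess(x = "maA", y = "maM",
                                       z = "maPrintTip",
                                       w = marray::maW(one_array)))
    )
  })
}

# Usage (needs a marrayRaw object with weights, e.g. "data" from above):
# norm.list <- normalise_each_array(data)
```

This returns a list with one normalised object per array; it still makes one maNormMain call per array, so it only saves typing.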

Thanks
Mick


