[BioC] Question about patchwork affy pre-processing

Mon Jun 11 22:01:56 CEST 2007

Hi:

I'm involved in an experiment using Affymetrix HG-U133 Plus 2 arrays.
I have affy, gcrma, and other relevant libraries up and running 
on my Linux system. 

I preprocessed using the 'threestep' function (which, as far as I 
can tell, lives in the affyPLM library), with the following settings:

normalize.method = "quantile.robust"
summary.method = "median.polish"
background.method = "GCRMA"

My question is this.  Someone suggested that their biostatistician
usually preprocesses via RMA and then merges MAS-5 present/absent
calls into the resulting data frame. Genes with MAS-5 absent calls 
are then omitted from any further analysis.  
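To make the suggested workflow concrete, here is a minimal sketch of the filtering step in base R. The matrices, probe IDs and sample names are toy stand-ins I made up for illustration; with real data they would presumably come from something like exprs(rma(abatch)) and exprs(mas5calls(abatch)) in the affy library. The rule used here (drop a probe set only when it is called absent on every array) is an assumption, since the suggestion did not say which rule the biostatistician applies.

```r
# Toy stand-ins (hypothetical values, not real data).
probes  <- c("probe_1", "probe_2", "probe_3", "probe_4")
samples <- c("s1", "s2", "s3")

# Log-scale expression summaries, one row per probe set.
expr <- matrix(c(7.1, 7.3, 6.9,
                 3.2, 3.0, 3.4,
                 9.8, 9.5, 9.9,
                 4.1, 4.4, 3.9),
               nrow = 4, byrow = TRUE,
               dimnames = list(probes, samples))

# Detection calls: "P" = present, "M" = marginal, "A" = absent.
calls <- matrix(c("P", "P", "M",
                  "A", "A", "A",
                  "P", "P", "P",
                  "A", "P", "A"),
                nrow = 4, byrow = TRUE,
                dimnames = list(probes, samples))

# Align the call matrix to the expression matrix by probe ID.
calls <- calls[rownames(expr), , drop = FALSE]

# Assumed rule: drop a probe set only if it is "A" on every array.
absent_everywhere <- rowSums(calls == "A") == ncol(calls)
filtered <- expr[!absent_everywhere, , drop = FALSE]

print(rownames(filtered))
```

Note that the filter is applied before any statistical analysis sees the data, which is exactly the hidden level of filtering I am asking about.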

My feeling is that MAS-5.0 is inferior on all three of the steps mentioned 
above, and if present/absent calls are based upon inferior techniques they 
should not be used.  I also believe that people are moving away from what 
I view as a hidden level of filtering.  It is my belief that the best way 
to do filtering is once, at the analysis stage.

Am I right in thinking that filtering on MAS-5 absent calls is a bad idea?

Grant Izmirlian

I have followed the debate on PM-only methods, and in my mind the development 
of GCRMA now allows an efficient way to model MMs so that background correction 
can be done without doubling the per-gene noise. 

So normalization 

Definitely the normalization, background correction and summary methods of
'threestep' are all the result of research that has applied the best 
statistical principles in lieu of the rather ad hoc techniques contained in 
MAS-5.

succeeded in refining the methods of MAS-5
My read of the literature and best practice tells me that this is not really a 
preferable way to do things.
-- 
Հրանդ Իզմիրլյան