[BioC] Removing probes before or after normalization

Jenny Drnevich drnevich at uiuc.edu
Fri Apr 21 20:59:32 CEST 2006


Hi Daniel,

I have been wondering about this myself recently. I think all the examples 
of gene filtering that I have seen do the filtering after the 
pre-processing steps, which is what I routinely do. I don't think I've seen 
a formal argument for this anywhere, but it seems that genes that are 
"Absent" (Affy calls) on all arrays, and/or genes that have little 
variation across arrays (although I don't personally filter on this), are a 
subset of the genes that do not change expression with treatment. Given 
that most normalization methods assume that most genes are not changing, 
you would not want to remove a portion of these genes before normalization; 
otherwise you increase the proportion of genes that do change and perhaps 
decrease the efficacy of the normalization.

On the other hand, I have also worked with Affy's soybean chips, which have 
probe sets from two other species (pests, I believe) in addition to 
soybean. In that case we removed the non-soybean genes before 
pre-processing, mostly because we were running into memory problems. I hope 
we are not being arbitrary in removing non-species-of-interest genes before 
normalization and then filtering species-specific genes after normalization 
using different criteria! Any other thoughts?
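
To make the worry about filtering before normalization concrete, here is a 
small simulation sketch. It is only an illustration under stated 
assumptions: the data are simulated, and the normalization is a simple 
median log-ratio shift standing in for real methods such as quantile 
normalization. The function names (median_ratio_normalize, unchanged_bias) 
and all of the numbers are made up for this example.

```python
import numpy as np

def median_ratio_normalize(control, treated):
    """Shift the treated array so its median log-ratio vs. control is zero.

    This toy normalization is only valid if most genes are unchanged.
    """
    offset = np.median(treated - control)
    return treated - offset

# Simulated log2 intensities (assumed values, not real data).
rng = np.random.default_rng(0)
n_genes, n_changed = 1000, 200
base = rng.normal(8.0, 1.5, n_genes)
changed = np.arange(n_genes) < n_changed          # first 200 genes respond
control = base + rng.normal(0, 0.1, n_genes)
treated = base + rng.normal(0, 0.1, n_genes) + 0.5  # technical array offset
treated[changed] += 2.0                             # true treatment effect

def unchanged_bias(keep):
    """Mean normalized log fold change of the truly unchanged genes."""
    c, t, ch = control[keep], treated[keep], changed[keep]
    lfc = median_ratio_normalize(c, t) - c
    return float(np.mean(lfc[~ch]))

# Filter AFTER normalization: normalize using every gene.
bias_after = unchanged_bias(np.ones(n_genes, dtype=bool))

# Filter BEFORE normalization: drop 700 of the 800 unchanged genes first,
# so the changing genes become the majority of what is normalized.
keep = changed.copy()
keep[np.flatnonzero(~changed)[:100]] = True
bias_before = unchanged_bias(keep)

print(f"unchanged-gene bias, normalize then filter: {bias_after:+.2f}")
print(f"unchanged-gene bias, filter then normalize: {bias_before:+.2f}")
```

In this toy setup the unchanged genes stay close to a log fold change of 
zero when all genes are normalized together, but look strongly 
down-regulated when most of them are removed first, because the median is 
then set by the changing genes. Real normalization methods and real filters 
are less extreme than this, so treat it as a picture of the direction of 
the effect rather than its size.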

Cheers,
Jenny

At 01:34 PM 4/21/2006, Bornman, Daniel M wrote:

>Dear BioC,
>
>I have a custom chip with multiple microbial organisms, but I am currently
>only interested in the results for one of these.  At what step in the
>analysis process is it advised to remove the other organisms from the
>analysis?  I worry that probes specific to those 'other' organisms may
>contribute to the background noise.  In that case maybe I should remove
>them prior to normalization and background correction.  Otherwise, maybe
>prior to independent testing and p-value adjustment. And, if not there,
>then prior to annotation.
>
>
>Thank You,
>
>Daniel Bornman
>Researcher
>Battelle Memorial Institute
>505 King Ave
>Columbus, OH 43201
>614.424.3229
>

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu


