[BioC] comparing different batches of data directly

James W. MacDonald jmacdon at med.umich.edu
Fri Dec 8 15:07:16 CET 2006

Hi Sabine,

Sabine Reichelt wrote:
> Hi!
> What would be the most appropriate approach if I want to compare gene
> expression data from different laboratories (and different biological
> sources) directly? Assuming the data were profiled on the same chip,
> of course. What kind of normalization (in batches? all together?) and
> subsequent processing would be "least harmful"?

This depends on what you mean by comparing things 'directly'. If you 
mean that you have some controls from lab 1 and some experimentals from 
lab 2 that you want to compare, then it doesn't really matter what you 
do, because the 'lab' effect is completely confounded with the effect 
you care about and no normalization can separate the two. In other 
words, you won't ever be able to determine whether a given change is 
due to biological differences or simply to technical variability from 
being run in different labs.
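
To make the confounding concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the function names, the baseline of 8.0, and the noise-free additive model (baseline + treatment effect + lab effect) are my assumptions, not anything taken from a real data set or package.

```python
def no_within_lab_contrast(groups, labs):
    """Return True if no lab ran more than one sample type."""
    groups_per_lab = {}
    for group, lab in zip(groups, labs):
        groups_per_lab.setdefault(lab, set()).add(group)
    return all(len(g) == 1 for g in groups_per_lab.values())


def observed_difference(treatment_effect, lab_effect, baseline=8.0):
    """Experimental (lab 2) minus control (lab 1) for one gene, noise-free.

    Assumes a simple additive model on the log scale (hypothetical).
    """
    control_lab1 = baseline
    experimental_lab2 = baseline + treatment_effect + lab_effect
    return experimental_lab2 - control_lab1


# Design from the first scenario: controls in lab 1, experimentals in lab 2.
groups = ["control", "control", "experimental", "experimental"]
labs = ["lab1", "lab1", "lab2", "lab2"]
confounded = no_within_lab_contrast(groups, labs)

# A purely biological change and a purely technical shift look identical.
biology_only = observed_difference(treatment_effect=1.0, lab_effect=0.0)
lab_only = observed_difference(treatment_effect=0.0, lab_effect=1.0)
print(confounded, biology_only, lab_only)
```

Both scenarios produce the same observed difference, so the data alone cannot say which one you are looking at.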

On the other hand, if you have microarray data for both sample types 
from each of the two labs (i.e., control and experimental samples from 
lab 1 and control and experimental samples from lab 2), then you would 
want to normalize the data from each lab as a separate batch and then 
compare using a mixed model. The GeneMeta package in the devel 
repository is designed to do this sort of thing. Alternatively, you 
could use something like lme() in the nlme package on a row-wise basis, 
i.e., one fit per row of the expression matrix (this would be slow, 
however).
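
As a rough illustration of why having both sample types in each lab helps, here is a small Python sketch for a single gene. Note that this is a deliberately simplified stand-in, not a mixed model and not the GeneMeta or lme() interface: it just takes the experimental-minus-control difference within each lab and averages those differences without weighting. The function name, the sample tuples, and all the numbers are made up for the example.

```python
from statistics import mean


def within_lab_effect(samples):
    """Average the experimental-minus-control difference over labs.

    samples: list of (lab, group, log2_expression) tuples for one gene.
    Simplified stand-in for a mixed model: unweighted, no variance estimate.
    """
    by_lab = {}
    for lab, group, value in samples:
        by_lab.setdefault(lab, {}).setdefault(group, []).append(value)
    diffs = [
        mean(g["experimental"]) - mean(g["control"])
        for g in by_lab.values()
        if "control" in g and "experimental" in g
    ]
    if not diffs:
        raise ValueError("no lab ran both sample types; lab is confounded")
    return mean(diffs)


# Hypothetical gene: true treatment effect 1.0, lab 2 reads 2.0 higher,
# and the design is unbalanced (lab 1 mostly controls, lab 2 mostly
# experimentals).
gene = [
    ("lab1", "control", 7.0), ("lab1", "control", 7.0),
    ("lab1", "control", 7.0), ("lab1", "experimental", 8.0),
    ("lab2", "control", 9.0), ("lab2", "experimental", 10.0),
    ("lab2", "experimental", 10.0), ("lab2", "experimental", 10.0),
]

effect = within_lab_effect(gene)
# Pooling all samples and ignoring lab mixes the lab shift into the estimate.
naive = (mean(v for _, g, v in gene if g == "experimental")
         - mean(v for _, g, v in gene if g == "control"))
print(effect, naive)
```

The within-lab estimate recovers the treatment effect, while the pooled comparison is inflated by the lab shift. A real analysis would also need variance estimates and sensible weighting, which is what the mixed-model approach is for.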



> Thanks for any answers! Sabine

James W. MacDonald, M.S.
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109

Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
