[BioC] Normalized microarray data and meta-analysis

Mayer, Claus-Dieter c.mayer at abdn.ac.uk
Thu Dec 18 00:42:35 CET 2008

Dear Kevin,

that is a difficult question indeed. I am not sure what type of microarrays we are talking about here, but if they are Affy arrays, then normalisation methods like RMA or GCRMA include an "across array" normalisation step, i.e. the normalised data from the same study will be more similar to each other than to data from different studies. So for better comparability across studies it seems better to normalise the raw arrays from all studies together.
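To illustrate why the set of arrays that are normalised together matters, here is a minimal sketch of plain quantile normalisation, which is the across-array step used in RMA. It is not the full RMA procedure (no background correction or probe summarisation, and ties are not handled specially), and the function name and the toy intensities are made up for illustration.

```python
def quantile_normalize(arrays):
    """Quantile-normalise a list of arrays (each a list of intensities).

    Hypothetical helper for illustration only: every array is forced onto
    the same reference distribution, the mean of the sorted arrays.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: mean intensity at each rank across all arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # Probe indices ordered by intensity, so order[rank] is a probe index.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data (invented): two arrays from study A, one from study B.
study_a = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
study_b = [[10.0, 20.0, 30.0]]

separate = quantile_normalize(study_a) + quantile_normalize(study_b)
joint = quantile_normalize(study_a + study_b)
```

In `separate` the reference distribution is computed within each study, so the study B array keeps its own scale; in `joint` all three arrays share one distribution. Even then, as discussed next, a study effect usually remains.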

Having said that, even if you are able to do this you will typically find that the data from the different studies cluster together, i.e. the normalisation is not able to remove all the differences between studies. So any proper meta-analysis must somehow take this study effect into account (and there is a growing amount of literature on how to do that). The importance of having the raw data depends on which approach you take: if you use a p-value combination approach like Stouffer's method, for example, it shouldn't matter much, but if you try to put all the data into one big analysis it might very well matter.
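As a rough sketch of the p-value combination idea, Stouffer's method turns each study's p-value into a z-score and averages them, so it only needs per-study p-values, not the raw intensities. The code below assumes one-sided p-values for the same gene from independent studies, each strictly between 0 and 1; the function name and the optional weights are illustrative choices, not taken from any package.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_combine(p_values, weights=None):
    """Combine one-sided p-values with Stouffer's Z method.

    Hypothetical helper: assumes independent studies and p-values strictly
    between 0 and 1. Returns (combined_z, combined_p).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(p_values)  # unweighted version
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Small p-values map to large positive z-scores.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]

    # The weighted sum of z-scores, rescaled to unit variance under the null.
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    combined_p = 1.0 - std_normal.cdf(combined_z)
    return combined_z, combined_p


# Example with made-up p-values for one gene in three studies.
z, p = stouffer_combine([0.05, 0.05, 0.05])
```

Three studies that are each only borderline significant give a combined p-value well below 0.05, because the evidence points in the same direction. Weights (for instance the square root of each study's sample size) are a common variant, but the right choice depends on the studies.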

Best Wishes


Hello Bioconductor-inos,

I have more of a statistical/philosophical question regarding using raw
vs. normalized data in a microarray meta-analysis.  I've looked through
the bioconductor archives and have found some addressing of this issue,
but not exactly what I'm concerned with.  I don't mean to waste anyone's
time, but I was hoping I could get some help here.

I've performed a meta-analysis using the downloaded data from 3
different GEO data sets (GDS).  It is my understanding that these are
normalized data from the various microarray experiments.  Seems to me
that if the data from those normalized results are normally distributed,
those three experiments are perfectly comparable (if you think the
authors' respective normalization approaches were reasonable).  All you
need to do is calculate some sort of effect size/determine a
p-value/etc. for all genes in the experimental conditions of interest
and then combine these statistics across the different experiments.
However, I consistently read things like "raw data are required for a
microarray meta-analysis."  Does this mean that normalized data are not
directly comparable with each other?  If so, then why does GEO even host
such data?

Any help would be wonderful!


K. Wyatt McMahon, Ph.D.
Texas Tech University Health Sciences Center
Department of Internal Medicine
3601 4th St.
Lubbock, TX - 79430


"It's been a good year in the lab when three things work. . . and one of
those is the lights." - Tom Maniatis

