[BioC] Affymetrix data double normalisation
shirley0818 at gmail.com
Tue Sep 25 17:19:46 CEST 2012
I have a similar question: how should I analyze two large Affymetrix
gene expression datasets together?
I have >2,000 Affymetrix arrays from a relatively old group. These data
were normalized a year ago, which took a lot of effort
(miss/mixed sample correction, quality checks, etc.).
Recently, we obtained another 3,000 arrays on the same Affymetrix
platform, but from a relatively younger group. These data were
normalized separately from the earlier data.
Now, my question is: if I would like to analyze these data together
(>5,000 samples), what would you suggest? The two approaches I can
think of are the following:
1. Re-normalize all 5,000 samples together
2. Double-normalize the two datasets: for example, apply a standard
transformation (z-score) or global median normalization to each
dataset, then pool them for the downstream statistical analysis
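
To make option 2 concrete, here is a minimal sketch of per-dataset
z-scoring followed by pooling. It is written in plain Python only for
illustration, not as Bioconductor code; the function names
(zscore_rows, combine) and the toy values are hypothetical, and it
assumes both datasets are probeset-by-sample matrices with the same
probesets in the same row order.

```python
from statistics import mean, pstdev

def zscore_rows(matrix):
    """Standardize each row (probeset) within one dataset to mean 0, SD 1."""
    out = []
    for row in matrix:
        m = mean(row)
        s = pstdev(row)  # population SD across the samples of this dataset
        # A probeset with no variation has SD 0; map it to zeros instead of dividing.
        out.append([0.0 if s == 0 else (x - m) / s for x in row])
    return out

def combine(ds_a, ds_b):
    """Column-bind two probeset-by-sample matrices that share row order."""
    if len(ds_a) != len(ds_b):
        raise ValueError("datasets must contain the same probesets in the same order")
    return [a + b for a, b in zip(ds_a, ds_b)]

# Toy values (assumed): 2 probesets; 3 samples in the old group, 2 in the young group.
old = [[8.0, 9.0, 10.0], [5.0, 5.0, 5.0]]
young = [[11.0, 13.0], [6.0, 8.0]]

# Each dataset is standardized on its own, then the samples are pooled.
combined = combine(zscore_rows(old), zscore_rows(young))
```

One caveat with this approach: because each dataset is centered
separately, any genuine mean difference between the old and young
groups is removed along with the batch difference, since group and
batch are confounded here.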
Thanks in advance for your help,
On Tue, Sep 25, 2012 at 10:21 AM, James W. MacDonald <jmacdon at uw.edu> wrote:
> Hi Jun,
> On 9/25/2012 7:11 AM, Jun Han [guest] wrote:
>> I would like to use gcrma to do a within group normalization first (30
>> groups in total), then input all the normalised 30 groups to do another
>> global gcrma.
>> Is this possible? Does the gcrma accept the inputs from the first
>> normalisation output?
> The short answer is no. When you run gcrma(), you do background correction,
> normalization, and finally summarization of the probe-level data, resulting
> in probeset-level data. In other words, you are taking the PM probes and
> summarizing them into a single value at the probeset level (after background
> correcting and normalizing).
> Since gcrma() expects you to be inputting an AffyBatch containing PM and MM
> probe data, it fails when you input an ExpressionSet containing summarized
> probeset level data.
> I assume you are trying to combine two groups that you think should not be
> normalized and summarized together. This leads to two questions - first, why
> don't you think these data can be combined prior to the gcrma() step, and
> second, if the answer to the first question is because of a batch effect,
> have you looked at, e.g., sva or ComBat?
>> Many thanks.
>> -- output of sessionInfo():
>> Error in function (classes, fdef, mtable) :
>> unable to find an inherited method for function "indexProbes", for
>> signature "ExpressionSet", "character"
>> Sent via the guest posting facility at bioconductor.org.
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> Search the archives:
> James W. MacDonald, M.S.
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099