[BioC] Affymetrix data double normalisation
James W. MacDonald
jmacdon at uw.edu
Tue Sep 25 17:21:29 CEST 2012
A third option is to use fRMA, which is designed specifically for the
situation you are in right now.
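To give a rough feel for what "frozen" means here, the following is a minimal sketch, not the frma package itself and written in Python rather than R purely for illustration. It assumes the core idea is that each array is quantile-normalized on its own to a reference distribution that was estimated once and then held fixed, so arrays processed at different times end up on the same scale. The function name, the simulated intensities, and the array sizes are all hypothetical, and the sketch leaves out the frozen probe-effect estimates that fRMA also uses at the summarization step.

```python
import numpy as np


def quantile_normalize_to_reference(sample, reference):
    """Map one array's intensities onto a fixed (frozen) reference distribution.

    Hypothetical helper for illustration only; not part of any Bioconductor package.
    """
    sample = np.asarray(sample, dtype=float)
    ref_sorted = np.sort(np.asarray(reference, dtype=float))
    if sample.shape != ref_sorted.shape:
        raise ValueError("sample and reference must have the same length")
    # 0-based rank of every probe within this one array
    ranks = np.argsort(np.argsort(sample, kind="stable"), kind="stable")
    # The probe with rank k receives the k-th smallest reference value
    return ref_sorted[ranks]


rng = np.random.default_rng(0)

# Estimate the reference once from a (simulated) training set, then freeze it:
# the mean of the sorted intensities across training arrays.
training = rng.normal(8.0, 2.0, size=(1000, 20))
reference = np.sort(training, axis=0).mean(axis=1)

# Two arrays from different batches, with different location and scale,
# each normalized on its own without seeing the other.
old_array = rng.normal(7.5, 1.5, size=1000)
new_array = rng.normal(9.0, 2.5, size=1000)
old_norm = quantile_normalize_to_reference(old_array, reference)
new_norm = quantile_normalize_to_reference(new_array, reference)

# Both arrays now share exactly the same intensity distribution.
print(np.allclose(np.sort(old_norm), np.sort(new_norm)))
```

Because the reference never changes, a newly arrived array can be processed without re-running the arrays that were already done, which is the property that matters when the two datasets were normalized separately.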
On 9/25/2012 11:19 AM, shirley zhang wrote:
> Hi Jim,
> I have a similar question: how to analyze two large Affymetrix
> gene expression datasets.
> I have >2,000 Affymetrix samples from a relatively old group. These data
> were normalized a year ago, which took a lot of effort
> (miss/mixed sample correction, quality check, etc.)
> Then recently, we got another 3,000 samples on the same Affymetrix
> platform, but from a relatively younger group. These data have been
> normalized separately from the previous data.
> Now, my question is if I would like to analyze these data together
> (>5,000 samples), what are your suggestions? Two possible ways that I
> can think of are the following:
> 1. Re-normalize all 5,000 samples together
> 2. Double-normalize the two datasets, for example, by
> standard transformation (z-score) or global median normalization of
> each dataset, then group them together for the downstream statistical
> analysis.
> Thanks in advance for your help,
> On Tue, Sep 25, 2012 at 10:21 AM, James W. MacDonald <jmacdon at uw.edu> wrote:
>> Hi Jun,
>> On 9/25/2012 7:11 AM, Jun Han [guest] wrote:
>>> I would like to use gcrma to do a within group normalization first (30
>>> groups in total), then input all the normalised 30 groups to do another
>>> global gcrma.
>>> Is this possible? Does the gcrma accept the inputs from the first
>>> normalisation output?
>> The short answer is no. When you run gcrma(), you do background correction,
>> normalization, and finally summarization of the probe-level data, resulting
>> in probeset-level data. In other words, you are taking the PM probes and
>> summarizing them into a single value at the probeset level (after background
>> correcting and normalizing).
>> Since gcrma() expects you to be inputting an AffyBatch containing PM and MM
>> probe data, it fails when you input an ExpressionSet containing summarized
>> probeset level data.
>> I assume you are trying to combine two groups that you think should not be
>> normalized and summarized together. This leads to two questions - first, why
>> don't you think these data can be combined prior to the gcrma() step, and
>> second, if the answer to the first question is because of a batch effect,
>> have you looked at e.g., sva or ComBat?
>>> Many thanks.
>>> -- output of sessionInfo():
>>> Error in function (classes, fdef, mtable) :
>>> unable to find an inherited method for function "indexProbes", for
>>> signature "ExpressionSet", "character"
>>> Sent via the guest posting facility at bioconductor.org.
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> Search the archives:
>> James W. MacDonald, M.S.
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
James W. MacDonald, M.S.
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099