[BioC] Affymetrix data double normalisation

shirley zhang shirley0818 at gmail.com
Tue Sep 25 17:23:29 CEST 2012


Great. Thanks Jim. I will check fRMA.  Shirley
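
For readers following the thread, here is a minimal sketch of what
trying fRMA could look like in R. It assumes the Bioconductor packages
affy and frma are installed, together with the frozen-parameter vectors
package that matches the array type (the thread does not say which
platform is involved). The helper name frma_from_dir() and the directory
names are hypothetical. Because fRMA preprocesses arrays against frozen
reference parameters, the two batches can be processed separately and
their expression values combined afterwards.

```r
# Hypothetical helper: read the CEL files in one directory and run fRMA on them.
# Assumes the 'affy' and 'frma' Bioconductor packages, plus the frozen-vector
# package matching the platform, are installed.
frma_from_dir <- function(cel_dir) {
  cel_files <- list.files(cel_dir, pattern = "\\.cel(\\.gz)?$",
                          ignore.case = TRUE, full.names = TRUE)
  batch <- affy::ReadAffy(filenames = cel_files)  # AffyBatch of probe-level data
  frma::frma(batch)                               # ExpressionSet, default settings
}

# Usage (paths are placeholders):
# old_eset   <- frma_from_dir("cel/old_group")
# young_eset <- frma_from_dir("cel/young_group")
# expr_all   <- cbind(Biobase::exprs(old_eset), Biobase::exprs(young_eset))
```

With thousands of arrays, the same helper could be called on smaller
subdirectories to keep memory use manageable.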

On Tue, Sep 25, 2012 at 11:21 AM, James W. MacDonald <jmacdon at uw.edu> wrote:
> Hi Shirley,
>
> A third option is to use fRMA, which is designed specifically for the
> situation you are in right now.
>
> Best,
>
> Jim
>
>
>
>
> On 9/25/2012 11:19 AM, shirley zhang wrote:
>>
>> Hi Jim,
>>
>> I have a similar question: how should I analyze two large Affymetrix
>> gene expression datasets together?
>>
>> I have >2,000 Affymetrix arrays from a relatively old group. These data
>> were normalized a year ago, which took a lot of effort
>> (miss/mixed sample correction, quality checks, etc.).
>>
>> Then recently, we got another 3,000 arrays on the same Affymetrix
>> platform, but from a relatively younger group. These data have been
>> normalized separately from the previous data.
>>
>> Now, my question is: if I would like to analyze these data together
>> (>5,000 samples), what are your suggestions? Two possible ways that I
>> can think of are the following:
>>
>> 1. Re-normalize all 5,000 samples together.
>> 2. Double-normalize the two datasets, for example by applying a
>> standard transformation (z-score) or global median normalization to
>> each dataset, and then group them together for the downstream
>> statistical analysis.
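
As an illustration of what option 2 means mechanically, here is a
minimal base-R sketch. It assumes the z-score is taken per probeset
(row) across the samples of each dataset, which is only one reading of
"z-score for each dataset". The matrices hold made-up values and every
name is hypothetical. Note that this centering also removes any real
average difference between the two datasets, which matters here because
dataset and age group coincide.

```r
# Toy matrices: rows are probesets, columns are samples (values are made up).
set.seed(1)
old_batch   <- matrix(rnorm(12, mean = 8, sd = 1), nrow = 3,
                      dimnames = list(paste0("ps", 1:3), paste0("old", 1:4)))
young_batch <- matrix(rnorm(15, mean = 10, sd = 2), nrow = 3,
                      dimnames = list(paste0("ps", 1:3), paste0("young", 1:5)))

# z-score each probeset (row) within a single dataset.
zscore_rows <- function(x) {
  (x - rowMeans(x)) / apply(x, 1, sd)
}

# Standardize each dataset separately, then place the samples side by side.
combined <- cbind(zscore_rows(old_batch), zscore_rows(young_batch))
```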
>>
>> Thanks in advance for your help,
>> Shirley
>>
>> On Tue, Sep 25, 2012 at 10:21 AM, James W. MacDonald <jmacdon at uw.edu> wrote:
>>>
>>> Hi Jun,
>>>
>>> On 9/25/2012 7:11 AM, Jun Han [guest] wrote:
>>>>
>>>> Hi,
>>>> I would like to use gcrma to do a within-group normalisation first (30
>>>> groups in total), then input all 30 normalised groups into another
>>>> global gcrma.
>>>> Is this possible? Does gcrma accept the output of the first
>>>> normalisation as input?
>>>
>>>
>>> The short answer is no. When you run gcrma(), you do background
>>> correction, normalization, and finally summarization of the probe-level
>>> data, resulting in probeset-level data. In other words, you are taking
>>> the PM probes and summarizing them into a single value at the probeset
>>> level (after background correcting and normalizing).
>>>
>>> Since gcrma() expects you to be inputting an AffyBatch containing PM and
>>> MM probe data, it fails when you input an ExpressionSet containing
>>> summarized probeset-level data.
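
The point above can be sketched in R: gcrma() is run once, on an
AffyBatch built from the raw CEL files of all groups, rather than on the
output of an earlier gcrma() call. This assumes the Bioconductor
packages affy and gcrma are installed. The helper name gcrma_from_cel()
and the directory name are hypothetical.

```r
# Hypothetical helper: run gcrma() once on the raw CEL files from all groups.
# Assumes the 'affy' and 'gcrma' Bioconductor packages are installed.
gcrma_from_cel <- function(cel_files) {
  batch <- affy::ReadAffy(filenames = cel_files)  # AffyBatch with PM and MM probes
  gcrma::gcrma(batch)                             # ExpressionSet, probeset level
}

# Usage (the directory is a placeholder):
# cel_files <- list.files("cel", pattern = "\\.cel$", ignore.case = TRUE,
#                         recursive = TRUE, full.names = TRUE)
# eset <- gcrma_from_cel(cel_files)
```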
>>>
>>> I assume you are trying to combine two groups that you think should not
>>> be normalized and summarized together. This leads to two questions:
>>> first, why don't you think these data can be combined prior to the
>>> gcrma() step, and second, if the answer to the first question is
>>> because of a batch effect, have you looked at e.g. sva or ComBat?
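
For the batch-effect route, a minimal sketch using ComBat() from the sva
package is given here. It assumes sva is installed and that the data
have already been summarized into one probeset-by-sample matrix. The
helper name combat_adjust() and its arguments are hypothetical.

```r
# Hypothetical helper: adjust a combined expression matrix for a known batch.
# Assumes the 'sva' Bioconductor package is installed.
#   expr  - numeric matrix, probesets in rows, samples in columns
#   batch - vector with one batch label per column of expr
combat_adjust <- function(expr, batch) {
  stopifnot(ncol(expr) == length(batch))
  sva::ComBat(dat = expr, batch = batch)
}

# Usage (objects are placeholders):
# adjusted <- combat_adjust(Biobase::exprs(eset), batch = group_labels)
```

Covariates of interest can be protected by passing a model matrix through
ComBat()'s mod argument, which this sketch leaves at its default.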
>>>
>>> Best,
>>>
>>> Jim
>>>
>>>
>>>> Many thanks.
>>>> Jun
>>>>
>>>>    -- output of sessionInfo():
>>>>
>>>>> gcrma12<-gcrma(gcrma1,gcrma2)
>>>>
>>>> Error in function (classes, fdef, mtable)  :
>>>>     unable to find an inherited method for function "indexProbes", for
>>>> signature "ExpressionSet", "character"
>>>>
>>>> --
>>>> Sent via the guest posting facility at bioconductor.org.
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>> --
>>> James W. MacDonald, M.S.
>>> Biostatistician
>>> University of Washington
>>> Environmental and Occupational Health Sciences
>>> 4225 Roosevelt Way NE, # 100
>>> Seattle WA 98105-6099
>>>



-- 
Xiaoling (Shirley) Zhang

M.D., Ph.D.
Boston University, Boston, MA
Tel: (857) 233-9862
Email: zhangxl at bu.edu
