[BioC] RMA vs gcRMA on 2 groups of samples

Sat Nov 3 03:10:11 CET 2007

Thanks, everyone, for the valuable input! I appreciate it!

Bogdan

-- 
Bogdan Tanasa, MD
Kellogg School of Science and Technology,
The Scripps Research Institute,
10550 North Torrey Pines Road,
La Jolla, California 92037

On 11/2/07, Robert Gentleman <rgentlem at fhcrc.org> wrote:

    Hi,
       If they were assayed at approximately the same time, using
    approximately the same protocols then yes, one normalization is likely
    to be better than two. I think that there may also be issues if the set
    of genes that are expressed is very different in the different tissue
    types (as them being the same is one of the basic assumptions in most
    normalization methods). But if very much is different, then it is better
    not to try to normalize everything together, but rather to adjust after
    normalization.

    best wishes
       Robert

    James W. MacDonald wrote:
    > Yes but if I am not mistaken, the OP had a situation in which the
    > samples were simply different cell or tissue types, rather than
    > different batches. In this case I would favor normalizing all together
    > rather than doing things in batches.
    >
    > Best,
    >
    > Jim
    >
    >
    > Robert Gentleman wrote:
    >>
    >> Naomi Altman wrote:
    >>> Dear Bogdan,
    >>> Any normalization method that uses a set of arrays reduces the
    >>> variability among those arrays.
    >>>
    >>> So, if you have 2 sets of arrays and normalize separately, you will
    >>> find that the within set variability is smaller than the between set
    >>> variability - i.e. you induce significant differential expression
    >>> simply by the normalization.  To avoid this effect, when you are
    >>> doing differential expression analysis (or sample clustering) you
    >>> must either use methods that normalize each array separately (MAS) or
    >>> normalize all together.
    >>
    >>   An alternative (and the one that I prefer) is to do separate
    >> normalizations, and to then use some sort of batch effect term in the
    >> model used to assess differentially expressed genes.
    >>
    >>   Normalization is intended to clean up the relatively minor issues
    >> that arise due to slightly different conditions etc. for arrays that
    >> are essentially the same.  As far as I can see it is not intended to
    >> adjust for batch effects, and in my experience generally does a bad
    >> job of that.  Just because you can normalize (or fit any statistical
    >> model) does not mean that you should.
    >>
    >>    best wishes
    >>      Robert
    >>
    >>
    >>> --Naomi
    >>>
    >>> At 12:01 PM 11/2/2007, Bogdan Tanasa wrote:
    >>>> Greetings Naomi,
    >>>>
    >>>> thanks for the reply. To generalize my question: when dealing with 2
    >>>> sets of samples, let's say X1, X2, ..., Xn and Y1, Y2, ..., Yn,
    >>>> I could run the normalization in 2 ways: A. only X(1,n) and only
    >>>> Y(1,n), or B. both X(1,n),Y(1,n). Are there any a priori statistical
    >>>> criteria that favor one way or the other? If I take into
    >>>> consideration biological criteria (the things I am interested in), the
    >>>> results from A may sometimes look better than B, or vice versa. Thanks!
    >>>> Bogdan
    >>>>
    >>>>
    >>>>
    >>>> On 11/2/07, Naomi Altman <naomi at stat.psu.edu> wrote:
    >>>>> Dear Bogdan,
    >>>>> I do not have an opinion on gcRMA versus RMA.  But if you are doing
    >>>>> differential expression analysis comparing the cell samples with the
    >>>>> organ samples, you need to normalize all the samples together.
    >>>>>
    >>>>> --Naomi
    >>>>>
    >>>>> At 11:31 AM 11/1/2007, Bogdan Tanasa wrote:
    >>>>>> Hi folks,
    >>>>>>
    >>>>>> I would like to ask for your opinions on the following:
    >>>>>>
    >>>>>> I have 60 expression profiles of 60 samples (cells and organs in
    >>>>>> resting conditions).
    >>>>>> I normalized these arrays in many ways, including RMA.
    >>>>>>
    >>>>>> Considering the biological arguments (cell samples vs organ
    >>>>>> samples), I am planning to do the normalization separately, on the
    >>>>>> group of cell samples, and on the group of organ samples.
    >>>>>>
    >>>>>> My questions are:
    >>>>>>
    >>>>>> - after RMA normalization on separate groups of samples (cells vs
    >>>>>> organs), the results are different, but are they better? GO
    >>>>>> analysis does not display major differences.
    >>>>>>
    >>>>>> - would gcRMA work better than RMA? The majority of opinions in
    >>>>>> SoCal are pro-RMA.
    >>>>>>
    >>>>>> thanks,
    >>>>>>
    >>>>>> Bogdan
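
A small simulation may help make Naomi's point in the thread concrete. The
sketch below is not RMA itself: it only implements plain quantile
normalization (the normalization step RMA uses) in pure Python, on made-up
data. The assumptions are mine, not the posters': 200 genes, 4 arrays per set,
no true differential expression, and a set-wide offset of 1.0 on the Y arrays.
The function names (quantile_normalize, mean_abs_group_gap, simulate) are
hypothetical helpers for this illustration only.

```python
import random
from statistics import mean


def quantile_normalize(arrays):
    """Quantile-normalize equal-length arrays against each other."""
    n_genes = len(arrays[0])
    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [mean(s[k] for s in sorted_arrays) for k in range(n_genes)]
    normalized = []
    for a in arrays:
        # Gene indices ordered by intensity on this array.
        order = sorted(range(n_genes), key=lambda g: a[g])
        out = [0.0] * n_genes
        for rank, g in enumerate(order):
            out[g] = reference[rank]
        normalized.append(out)
    return normalized


def mean_abs_group_gap(x_arrays, y_arrays):
    """Average over genes of |mean(Y) - mean(X)|."""
    n_genes = len(x_arrays[0])
    return mean(
        abs(mean(a[g] for a in y_arrays) - mean(a[g] for a in x_arrays))
        for g in range(n_genes)
    )


def simulate(n_genes=200, n_arrays=4, shift=1.0, seed=1):
    """Two sets of arrays sharing the same true expression (assumed data)."""
    rng = random.Random(seed)
    base = [rng.gauss(8.0, 2.0) for _ in range(n_genes)]

    def make(offset):
        return [[b + offset + rng.gauss(0.0, 0.2) for b in base]
                for _ in range(n_arrays)]

    return make(0.0), make(shift)  # X arrays, Y arrays (set-wide offset)


x_raw, y_raw = simulate()

# Option A from the thread: normalize each set on its own.
x_sep = quantile_normalize(x_raw)
y_sep = quantile_normalize(y_raw)

# Option B from the thread: normalize all arrays together.
joint = quantile_normalize(x_raw + y_raw)
x_joint, y_joint = joint[:len(x_raw)], joint[len(x_raw):]

separate_gap = mean_abs_group_gap(x_sep, y_sep)
joint_gap = mean_abs_group_gap(x_joint, y_joint)
print("mean |Y - X| per gene, separate normalization:", round(separate_gap, 3))
print("mean |Y - X| per gene, joint normalization:   ", round(joint_gap, 3))
```

Under these assumptions, separate normalization makes the arrays within each
set share one distribution while leaving the set-wide offset in place, so every
gene appears to differ between sets. Joint normalization removes the offset.
Whether such an offset is technical or biological is exactly what the thread
debates; if it reflects real differences between tissue types, Robert's
suggestion of handling it after normalization may be the better fit.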