[BioC] RMA normalization when using subsets of samples

Wed Feb 15 04:04:39 CET 2006

I only wish that Wolfgang's answer matched my experience.  It does 
seem to matter.

I don't think there is a statistical answer to your question, but as 
a statistician, I do feel more comfortable preprocessing all arrays together.

--Naomi

At 07:01 PM 2/14/2006, Adaikalavan Ramasamy wrote:
>This would be a problem if one or more of the resulting subsets is small
>and contains outliers.
>
>My preference is to preprocess all arrays together. My reasoning is that
>doing this gives the RMA median polish step (and, to a lesser extent, the
>quantile normalisation step) much more information to work with.
>
>Regards, Adai
>
>On Wed, 2006-02-15 at 17:16 +0000, Wolfgang Huber wrote:
> > Dear Sylvia,
> >
> > this might not be the answer that you want to hear, but for the end
> > result it should not matter (substantially) which of the two
> > possibilities you take, and I would be worried if it did.
> >
> > Best wishes
> >   Wolfgang
> >
> > -------------------------------------
> > Wolfgang Huber
> > European Bioinformatics Institute
> > European Molecular Biology Laboratory
> > Cambridge CB10 1SD
> > England
> > Phone: +44 1223 494642
> > Fax:   +44 1223 494486
> > Http:  www.ebi.ac.uk/huber
> > -------------------------------------
> >
> > Sylvia.Merk at ukmuenster.de wrote:
> > > Dear bioconductor list,
> > >
> > > I have a question concerning RMA-normalization:
> > >
> > > There are for example 200 CEL-Files and the clinicians have several
> > > research questions - each concerning only a subset of the 200 samples
> > > including the possibility that some samples are included in more than
> > > one question.
> > >
> > > There are two possibilities to normalize the CEL-Files:
> > >
> > > 1.: Read all 200 chips into an AffyBatch object, normalize all 200
> > > chips together, and then analyze the required subset.
> > >
> > > 2.: Read only the required chips into an AffyBatch object, normalize
> > > these chips, and then perform further analysis.
> > > I think that this approach is the better one, but it has the disadvantage
> > > that some samples are included in several normalizations, resulting in
> > > different gene expression levels for a single sample.
> > >
> > > What is (from a statistician's point of view) the appropriate approach
> > > to normalize CEL-Files in this case?
> > >
> > > Thank you in advance
> > > Sylvia
> > >
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >
>

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111
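
Sylvia's observation that the same sample can end up with different expression values depending on which chips it is normalized with can be illustrated with a toy sketch. The code below is an assumption-laden simplification, not the affy package's rma(): it implements only a basic quantile normalisation in plain Python (no background correction, no median polish, ties broken by position), and the sample names and intensities are made up. The third chip is deliberately an outlier, so the size of the difference is exaggerated and says nothing about how large the effect is on real data.

```python
def quantile_normalize(arrays):
    """Quantile-normalize {sample_name: [intensities]} (simplified sketch).

    Every sample must have the same number of probes. Ties are broken by
    position instead of being averaged, which real implementations handle
    more carefully.
    """
    names = list(arrays)
    n_probes = len(arrays[names[0]])

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_cols = [sorted(arrays[name]) for name in names]
    reference = [
        sum(col[k] for col in sorted_cols) / len(names)
        for k in range(n_probes)
    ]

    normalized = {}
    for name in names:
        values = arrays[name]
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: values[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace value by reference quantile
        normalized[name] = out
    return normalized


# Hypothetical intensities for three chips and four probes.
chips = {
    "s1": [2.0, 4.0, 6.0, 8.0],
    "s2": [1.0, 3.0, 5.0, 7.0],
    "s3": [10.0, 30.0, 20.0, 40.0],  # deliberately extreme chip
}

# Possibility 1: normalize all chips together, then look at the subset.
all_together = quantile_normalize(chips)

# Possibility 2: normalize only the chips needed for one question.
subset_only = quantile_normalize({k: chips[k] for k in ("s1", "s2")})

print("s1 normalized with all chips:", all_together["s1"])
print("s1 normalized with subset:   ", subset_only["s1"])
```

The two printed vectors for s1 differ because the reference distribution is computed from whichever chips are in the batch. Within each run the rank order of probes on a chip is preserved, so whether the difference matters for the final analysis depends on the data, which is the point on which the replies above disagree.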