[BioC] load and normalize arrays from different platform

Henrik Bengtsson hb at biostat.ucsf.edu
Wed Feb 23 18:23:51 CET 2011


Just a follow up:

The 'HG-U133A_2' chip type is a *physically different* array than the
'HT_HG-U133A' chip type (aliased 'U133AAofAv2' during its early-access
stage).  For instance, the former has 732x732 probes whereas the
latter has 744x744 probes, cf.

  http://www.aroma-project.org/chipTypes/

In other words, the problem is *not* just about different chip type
*aliases* (as it would have been if you had CEL files labelled
'HT_HG-U133A' and 'U133AAofAv2', in which case you could in theory
treat them all as being of the same chip type).
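
If you want to confirm which chip type and array dimensions each CEL file
really has, the CEL headers can be inspected. Below is a minimal sketch,
assuming the affxparser package is installed; the helper name
group_by_chip_type() and its 'read_header' argument are made up for
illustration and are not part of any package.

```r
# Group CEL files by the chip type recorded in their headers.
# 'read_header' defaults to affxparser::readCelHeader(); it is exposed as an
# argument (an assumption of this sketch) so that the helper can be tried
# out without real CEL files.
group_by_chip_type <- function(files, read_header = affxparser::readCelHeader) {
  info <- lapply(files, function(f) {
    h <- read_header(f)
    # 'chiptype', 'rows' and 'cols' are fields of the CEL header list
    data.frame(file = f, chiptype = h$chiptype,
               rows = h$rows, cols = h$cols,
               stringsAsFactors = FALSE)
  })
  info <- do.call(rbind, info)
  # One data frame per chip type found among the files
  split(info, info$chiptype)
}
```

Files that end up in different groups, or that report different rows/cols,
cannot be read into one AffyBatch.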

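You need to proceed as others have already suggested in this thread:
process each chip type separately, then restrict to the probesets common
to both before analyzing or normalizing the two sets together.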
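
As a rough sketch of that suggestion (not a tested recipe): summarize each
chip type on its own without normalization, keep the common probesets, and
normalize the combined matrix with limma. The function names
combine_common() and normalize_two_platforms(), and the arguments
'celfiles_a' and 'celfiles_b', are placeholders invented for this example.
It assumes the affy, Biobase and limma packages and the matching CDF
packages are available, and quantile normalization is only one of several
possible choices.

```r
# Keep only the probesets present on both arrays and bind samples column-wise.
combine_common <- function(expr_a, expr_b) {
  common <- intersect(rownames(expr_a), rownames(expr_b))
  cbind(expr_a[common, , drop = FALSE], expr_b[common, , drop = FALSE])
}

# Hypothetical wrapper: 'celfiles_a' and 'celfiles_b' are character vectors
# of CEL file paths, one vector per chip type.
normalize_two_platforms <- function(celfiles_a, celfiles_b) {
  library(affy)   # ReadAffy(), rma(); attaches Biobase for exprs()
  library(limma)  # normalizeBetweenArrays()

  # Background-correct and summarize each chip type separately,
  # skipping the per-batch normalization step.
  expr_a <- exprs(rma(ReadAffy(filenames = celfiles_a), normalize = FALSE))
  expr_b <- exprs(rma(ReadAffy(filenames = celfiles_b), normalize = FALSE))

  # Normalize all samples together on the common probesets.
  normalizeBetweenArrays(combine_common(expr_a, expr_b), method = "quantile")
}
```

Even after joint normalization it is worth checking the result (for
example with a clustering or MDS plot) for a remaining batch effect
between the two chip types.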

/Henrik

On Tue, Feb 22, 2011 at 5:02 PM, Moshe Olshansky <olshansky at wehi.edu.au> wrote:
> Hi Wendy,
>
> Just one more comment: since you are using two different platforms you
> should expect a batch effect, so it is important to normalize the two
> batches together (and there are several ways of doing this). Even then,
> check your normalized expression values to see whether you still have
> a batch effect.
>
> Moshe.
>
>> Hi Moshe and James,
>>
>> Thank you very much for your suggestions. They make sense. I will do that.
>>
>> Regards,
>> Wendy
>>
>>
>>
>> On 22 February 2011 19:27, Moshe Olshansky <olshansky at wehi.edu.au> wrote:
>>
>>> If you wish to normalize them together you can do what Jim suggested but
>>> without normalization, get the two expression matrices, combine them
>>> (using common genes only) and then use normalization functions from limma
>>> to normalize the two sets together.
>>>
>>> Regards,
>>> Moshe.
>>>
>>> > Hi Wendy,
>>> >
>>> > On 2/22/2011 3:13 PM, Wendy Qiao wrote:
>>> >> Hi all,
>>> >>
>>> >> I need to load and normalize CEL files from two different platforms,
>>> >> one platform is *U133AAofAv2 (22944 affyids)* and the other is
>>> >> *HG-U133A_2 (22277 affyids)*. I believe that these two platforms have
>>> >> very similar annotations.
>>> >
>>> > They may have similar annotations, but you won't be able to load and
>>> > normalize together. You are better off normalizing separately, and then
>>> > if you need to analyze together, you can attempt to subset to the
>>> > intersecting probesets and then do the analysis.
>>> >
>>> > Best,
>>> >
>>> > Jim
>>> >
>>> >> When I read all the files together using ReadAffy, I got an error
>>> >> saying,
>>> >>
>>> >> > es.affy<-ReadAffy(filenames=celfile, celfile.path=celpath,
>>> >> >                   phenoData=NULL)
>>> >> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
>>> >>    Cel file XX does not seem to have the correct dimensions
>>> >>
>>> >> I figure that is because the two platforms have different CDFs. So I
>>> >> tried to change the cdf name for *U133AAofAv2* using
>>> >> library("affxparser"). Then I got the following error,
>>> >>
>>> >> > convertCel(celfile, celfile.output, newChipType="HG-U133A_2")
>>> >> Error in .unwrapDatHeaderString(header$DatHeader) :
>>> >>    Internal error: Failed to extract 'pixelRange' and 'sampleName' from DAT
>>> >>    header.  They became identical:   HG-U133A_2.1sq
>>> >>
>>> >> I am not sure how to get around this problem. Could anybody help? Or
>>> >> what would be the best way to normalize two datasets like mine? Thank
>>> >> you very much. Any suggestion is appreciated.
>>> >>
>>> >> Thank you very much,
>>> >> Wendy
>>> >>
>>> >> _______________________________________________
>>> >> Bioconductor mailing list
>>> >> Bioconductor at r-project.org
>>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> >> Search the archives:
>>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> >
>>> > --
>>> > James W. MacDonald, M.S.
>>> > Biostatistician
>>> > Douglas Lab
>>> > University of Michigan
>>> > Department of Human Genetics
>>> > 5912 Buhl
>>> > 1241 E. Catherine St.
>>> > Ann Arbor MI 48109-5618
>>> > 734-615-7826
>>>
>>> Moshe Olshansky
>>> Division of Bioinformatics
>>> The Walter & Eliza Hall Institute of Medical Research
>>> 1G Royal Parade, Parkville, Vic 3052
>>> e-mail: olshansky at wehi.edu.au
>>> tel: (03) 9345 2697
>>


