[BioC] Normalization of array data from GEO repository

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Jul 7 19:59:19 CEST 2009


Hi,

On Jul 7, 2009, at 5:38 AM, Aleš Maver wrote:

> Hi all,
> I have obtained several GEO Series (GSE) entries from GEO repository using
> getGEO function (GEOquery package).
> Data obtained in this manner is stored in ExpressionSet class. The problem
> is I don't know how to perform quality control analyses and normalization
> procedures on ExpressionSet data, because functions like expresso (affy
> package) work only on AffyBatch classes. Is there anything I am missing?

Sorry, I've never used the GEOquery package, so I can't say much about
it, but I'd be surprised if there were no way to get your data as an
AffyBatch object: I'd dare say that most of the data in GEO can be
downloaded in its raw format (e.g., CEL files or whatever the platform
produces).
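
A minimal sketch of that route, assuming the series was run on an
Affymetrix platform and that the submitter uploaded the raw CEL files as
a supplementary archive. The accession GSE0000 is a placeholder, and the
_RAW.tar file name is the usual GEO convention rather than a guarantee.
getGEOsuppFiles is part of GEOquery; ReadAffy and rma come from affy.

```r
library(GEOquery)  # getGEOsuppFiles
library(affy)      # ReadAffy, rma

# Placeholder accession (an assumption for illustration); use your own.
gse_id <- "GSE0000"

# Download the supplementary files into a sub-directory named after
# the accession, under the current working directory.
getGEOsuppFiles(gse_id)

# Assumed archive name: GEO usually bundles raw files as <GSE>_RAW.tar.
tar_file <- file.path(gse_id, paste0(gse_id, "_RAW.tar"))
cel_dir  <- file.path(gse_id, "cel")
untar(tar_file, exdir = cel_dir)

# Read every CEL file found in cel_dir into an AffyBatch.
# Recent versions of affy read gzip-compressed CEL files directly;
# decompress them first if yours does not.
raw <- ReadAffy(celfile.path = cel_dir)

# Background-correct, normalize and summarize; returns an ExpressionSet
# of log2 expression values.
eset <- rma(raw)
```

The AffyBatch in `raw` is the kind of object that expresso and the other
affy functions expect, so it can be used in place of rma here.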

> And- does anyone know whether data in GEO repository is already normalised
> or not?

It depends. Sometimes you aren't given the raw files: the data may come
from a custom array, and I've also seen some datasets provided only in
post-processed form (already MAS5 normalized, for example). In my
experience, though, you can get the raw data for most of the
experiments you find there.
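
One rough way to check a particular series is to look at what the
submitter recorded and at the scale of the values. The sketch below
assumes `eset` is an ExpressionSet such as one element of the list
returned by getGEO. The data_processing column is commonly present in
the sample annotation of GEO series matrices but is not guaranteed, the
helper name is made up for illustration, and the cut-off of 30 is an
arbitrary heuristic, not a rule.

```r
library(Biobase)  # exprs, pData, ExpressionSet

# Heuristic helper (hypothetical name): summarize what can be learned
# about prior processing from an ExpressionSet.
describe_processing <- function(eset, log_cutoff = 30) {
  vals <- exprs(eset)
  pd   <- pData(eset)
  # Submitter's free-text description, if the column exists.
  note <- if ("data_processing" %in% colnames(pd)) {
    unique(as.character(pd$data_processing))
  } else {
    NA_character_
  }
  list(
    processing_note = note,
    value_range     = range(vals, na.rm = TRUE),
    # This only indicates the scale (log vs. linear), not whether the
    # samples were normalized against each other.
    looks_log_scale = max(vals, na.rm = TRUE) < log_cutoff
  )
}

# Toy data standing in for a downloaded series (simulated, not GEO data).
set.seed(1)
toy_values <- matrix(
  runif(40, min = 2, max = 14), nrow = 10,
  dimnames = list(paste0("probe", 1:10), paste0("sample", 1:4))
)
toy <- ExpressionSet(assayData = toy_values)
describe_processing(toy)

# Per-sample distributions: roughly aligned boxes suggest the samples
# have already been normalized against each other.
boxplot(exprs(toy))
```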

Also, for array quality assessment, look into the arrayQualityMetrics  
package:

http://www.bioconductor.org/packages/release/bioc/html/arrayQualityMetrics.html
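
As a sketch of how that package is typically invoked, assuming `eset` is
an ExpressionSet (or AffyBatch) already loaded in the session; the
wrapper name and the output directory are placeholders.

```r
library(arrayQualityMetrics)

# Hypothetical wrapper: write a quality report for `eset` into `outdir`.
run_qc_report <- function(eset, outdir = "qc_report") {
  arrayQualityMetrics(
    expressionset   = eset,
    outdir          = outdir,
    force           = TRUE,   # overwrite outdir if it already exists
    do.logtransform = FALSE   # set TRUE if values are not on a log scale
  )
}

# Usage, once an ExpressionSet called eset exists:
# run_qc_report(eset)
```

The call writes an HTML report with the diagnostic plots into the output
directory.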

Hope that helps,
-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

Contact Info: http://cbio.mskcc.org/~lianos/contact


