[BioC] GEOquery on rawdata and processed data ?

Jenny Drnevich drnevich at uiuc.edu
Tue Jul 3 21:47:20 CEST 2007


CEL files contain the probe-level data, so by definition they contain 
'raw' data (no background correction, normalization or 
summarization). So CEL files never contain processed data...
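
As a rough sketch of how this plays out with GEOquery (not a definitive recipe): the sample's metadata says how the VALUE column was produced, and the supplementary_file entries point at the raw CEL file. The helper names below (pick_cel_files, describe_gsm, fetch_raw_affybatch) are made up for illustration and are not part of GEOquery or affy. The two wrappers need network access and are only defined here, not called.

```r
# Hypothetical helper: keep only CEL entries (.CEL or .CEL.gz, any case)
# from a character vector of supplementary file paths.
pick_cel_files <- function(paths) {
  paths[grepl("\\.cel(\\.gz)?$", paths, ignore.case = TRUE)]
}

# Hypothetical wrapper: report how the VALUE column of a GSM record was
# produced and which of its supplementary files are CEL files.
# Requires the GEOquery package and network access.
describe_gsm <- function(accession) {
  gsm  <- GEOquery::getGEO(accession)   # accession must be a quoted string
  meta <- GEOquery::Meta(gsm)           # named list of sample metadata
  list(
    processing = meta$data_processing,  # e.g. "RMA (robust multi-array)"
    cel_files  = pick_cel_files(meta$supplementary_file)
  )
}

# Hypothetical wrapper: download a sample's supplementary files and read
# the CEL file(s) into an AffyBatch for custom preprocessing.
# Assumption: the installed affy version reads gzipped CEL files directly;
# if it does not, decompress them first.
fetch_raw_affybatch <- function(accession, dest = tempdir()) {
  GEOquery::getGEOSuppFiles(accession, baseDir = dest)  # writes to dest/<accession>/
  files <- list.files(file.path(dest, accession), full.names = TRUE)
  affy::ReadAffy(filenames = pick_cel_files(files))
}

# Offline check using the supplementary_file entries quoted below
supp <- c("file:///samples/GSM72287/GSM72287.CEL.gz",
          "file:///samples/GSM72287/GSM72287.EXP.gz")
cel <- pick_cel_files(supp)
print(cel)
```

With network access, describe_gsm("GSM72287") should report the RMA processing string shown in the quoted output, and fetch_raw_affybatch("GSM72287") would be the starting point for applying your own background correction and normalization.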

Cheers,
Jenny

At 02:39 PM 7/3/2007, Saroj Mohapatra wrote:
>There are links to the .CEL files (I guess these would be the "raw" files) at GEO.
>
>E.g., GSM72287 is part of the series GSE3218. At the bottom of the 
>page (below) there is a link under 'Supplementary files'.
>
>http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE3218
>
>HTH
>
>Saroj
>
>
>Alex Tsoi wrote:
>
>>I figured out that those are the RMA-processed data, so my question should be:
>>how can I get the raw data?
>>
>>
>>On 7/3/07, Alex Tsoi <tsoi.teen at gmail.com> wrote:
>>
>>
>>>Dear all,
>>>
>>>I use the function getGEO from GEOquery to retrieve different cancer data
>>>sets from GEO to do a meta-analysis.
>>>
>>>However, I am not quite sure whether the data I downloaded have already been
>>>processed (e.g. RMA or MAS). Is it true that all the .CEL files might be
>>>processed while all the .EXP files are raw?
>>>
>>>Also, if I assign as:
>>>
>>>
>>>
>>>>rawdata <- getGEO("GSM72287")
>>>>
>>>"rawdata" has the data table with column names ID_REF and VALUE:
>>>
>>>but are those processed or raw data values ?
>>>
>>>My main goal is to get the raw data values from each sample so I could do
>>>a meta analysis by applying my own processing
>>>methods.
>>>
>>>The rawdata object is shown below.
>>>
>>>I would greatly appreciate any help.
>>>
>>>
>>>
>>>An object of class "GSM"
>>>channel_count
>>>[1] "1"
>>>characteristics_ch1
>>>[1] "mixed GCT (Embryonal Carcinoma, Seminoma)"
>>>contact_address
>>>[1] "1275 York Ave"
>>>contact_city
>>>[1] "New York"
>>>contact_country
>>>[1] "USA"
>>>contact_department
>>>[1] "Cell Biology"
>>>contact_email
>>>[1] " korkolaj at mskcc.org"
>>>contact_institute
>>>[1] "Memorial Sloan-Kettering"
>>>contact_laboratory
>>>[1] "Chaganti"
>>>contact_name
>>>[1] "James,,Korkola"
>>>contact_phone
>>>[1] "212-639-8281"
>>>contact_state
>>>[1] "NY"
>>>contact_zip/postal_code
>>>[1] "10021"
>>>data_processing
>>>[1] "RMA (robust multi-array)"
>>>data_row_count
>>>[1] "22645"
>>>description
>>>[1] "Adult Male Germ Cell Tumor"
>>>extract_protocol_ch1
>>>[1] "Frozen tissue from a germ cell tumor was minced and homogenized in
>>>RLT buffer (Qiagen).Total RNA was extracted from the tissue lysate using an
>>>RNeasy kit (Qiagen)."
>>>geo_accession
>>>[1] "GSM72287"
>>>hyb_protocol
>>>[1] "standard Affymetrix procedures"
>>>label_ch1
>>>[1] "biotin"
>>>label_protocol_ch1
>>>[1] "Approximately 12 ug of total RNA was processed to produce
>>>biotinylated cRNA targets."
>>>last_update_date
>>>[1] "Oct 12 2005"
>>>molecule_ch1
>>>[1] "total RNA"
>>>organism_ch1
>>>[1] "Homo sapiens"
>>>platform_id
>>>[1] "GPL97"
>>>scan_protocol
>>>[1] "standard Affymetrix procedures"
>>>series_id
>>>[1] "GSE3218"
>>>source_name_ch1
>>>[1] "germ cell tumor"
>>>status
>>>[1] "Public on Nov 10 2005"
>>>submission_date
>>>[1] "Aug 29 2005"
>>>supplementary_file
>>>[1] "file:///samples/GSM72287/GSM72287.CEL.gz"
>>>[2] "file:///samples/GSM72287/GSM72287.EXP.gz"
>>>title
>>>[1] "germ cell tumors (GCT) and normal controls 052B 1"
>>>type
>>>[1] "RNA"
>>>An object of class "GEODataTable"
>>>****** Column Descriptions ******
>>>  Column                     Description
>>>1 ID_REF                              \t
>>>2  VALUE RMA-calculated Signal intensity
>>>****** Data Table ******
>>>       ID_REF     VALUE
>>>1 200000_s_at  9.913362
>>>2   200001_at  9.822533
>>>3   200002_at 11.318111
>>>4 200003_s_at 12.280321
>>>5   200004_at 11.068576
>>>22640 more rows ...
>>>
>>>
>>>
>>>--
>>>Lam C. Tsoi (Alex)
>>>Medical University of South Carolina
>>>

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu


