[BioC] GSMList problem

Sean Davis sdavis2 at mail.nih.gov
Sun Sep 7 13:07:17 CEST 2008


On Sun, Sep 7, 2008 at 6:55 AM, hemant ritturaj
<ritturajhemant at gmail.com> wrote:
> Dear All,
>
> I was trying to retrieve a GSE record from NCBI GEO using the GEOquery package.
>
> The record was downloaded using
>  > gse6901 <- getGEO("GSE6901")
>
>> show(gse6901)
>
> $GSE6901_series_matrix.txt.gz
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 57381 features, 12 samples
>  element names: exprs
> phenoData
>  sampleNames: GSM159259, GSM159260, ..., GSM159270  (12 total)
>  varLabels and varMetadata description:
>    title: NA
>    geo_accession: NA
>    ...: ...
>    data_row_count: NA
>    (35 total)
> featureData
>  featureNames: AFFX-BioB-3_at, AFFX-BioB-5_at, ..., RPTR-Os-XXU09476-1_at
> (57381 total)
>  fvarLabels and fvarMetadata description:
>    ID: NA
>    GB_ACC: NA
>    ...: ...
>    Gene.Ontology.Molecular.Function: NA
>    (16 total)
>  additional fvarMetadata: Column, Description
> experimentData: use 'experimentData(object)'
> Annotation: GPL2025
>
> When I used
>
>  >gsmplatforms <- lapply(GSMList(gse), function(x) {
> + Meta(x)$platform
> + })
>
> It gave me an error
>
> Error in function (classes, fdef, mtable)  :
>  unable to find an inherited method for function "GSMList", for signature
> "list"
>
> I checked my R and Bioconductor versions, listed below, and they seem
> fine, as they are the latest
>
>
>> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-redhat-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] limma_2.14.6   GEOquery_2.4.5 RCurl_0.9-4    Biobase_2.0.1
>
> I really cannot work out what the error is or how to solve it. If
> anyone has a solution, please let me know. I shall be very thankful.

Hello, Hemant.  Sorry for the confusion.  Starting with the latest
release (bioc 2.2), getGEO uses a default of GSEMatrix=TRUE (see the
help page for getGEO).  This results in much faster parsing of GSE
information than GSEMatrix=FALSE, which was the older behavior.
Using GSEMatrix=TRUE returns a list of ExpressionSet objects rather
than a GSE object, which is why GSMList() reports that it has no
method for signature "list".  In your case, to get the first
ExpressionSet, you can do:

gse6901.eset <- gse6901[[1]]
# to get the sample names (GSM accessions)
sampleNames(gse6901.eset)

Since gse6901.eset is an ExpressionSet, it will generally work with
other Bioconductor tools very nicely (no need to do all that manual
conversion, etc., that was required without GSEMatrix=TRUE).  Also,
there is extensive documentation on using ExpressionSets via Biobase.
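
As a rough sketch of what working with the ExpressionSet looks like,
the helper below collects the platform, sample names, and matrix
dimensions using standard Biobase accessors.  The function name
summarise_eset is made up for illustration and is not part of GEOquery
or Biobase; calling it assumes Biobase is installed.

```r
# Hypothetical helper (not part of any package): summarise an ExpressionSet
# such as the one obtained with gse6901[[1]].
summarise_eset <- function(eset) {
  list(
    # platform accession stored in the annotation slot
    platform = Biobase::annotation(eset),
    # GSM accessions, one per sample
    samples  = Biobase::sampleNames(eset),
    # features x samples dimensions of the expression matrix
    dims     = dim(Biobase::exprs(eset))
  )
}

# Example call, assuming gse6901 came from getGEO("GSE6901"):
# info <- summarise_eset(gse6901[[1]])
```

Going by the show() output quoted above, the platform should come back
as GPL2025 with 12 samples, which covers what the GSMList loop was
trying to find out.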

If you do, for some reason, need the older behavior, simply specifying
GSEMatrix=FALSE in the getGEO call will get that behavior back.
However, I would suggest moving to the faster, cleaner GSEMatrix
parsing if possible.
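
For completeness, here is a minimal sketch of the older route, wrapped
in a function so that nothing is downloaded until it is called.  The
name platforms_old_style is made up for illustration, and the
Meta(x)$platform field is taken from the code in the original post; the
field name may differ between GEOquery versions.

```r
# Hypothetical wrapper (not part of GEOquery): restore the older behaviour
# and list the platform recorded for each sample.
# Calling it needs the GEOquery package and network access.
platforms_old_style <- function(accession) {
  # GSEMatrix = FALSE makes getGEO return a single GSE object, not a list
  gse <- GEOquery::getGEO(accession, GSEMatrix = FALSE)
  # GSMList() has a method for GSE objects; Meta() returns each sample's header
  lapply(GEOquery::GSMList(gse), function(x) GEOquery::Meta(x)$platform)
}

# Example call (downloads and parses the full record, which can be slow):
# gsmplatforms <- platforms_old_style("GSE6901")
```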

Sean


