[BioC] GEOquery and Sample Subsets

Thomas H. Hampton Thomas.H.Hampton at dartmouth.edu
Tue Jun 4 20:02:17 CEST 2013


This looks totally cool.

Is there a place where one can view the schema of the relational db?

In any case -- Thanks tons!

Tom



________________________________________
From: seandavi at gmail.com [seandavi at gmail.com] on behalf of Sean Davis [sdavis2 at mail.nih.gov]
Sent: Tuesday, June 04, 2013 1:19 PM
To: Thomas H. Hampton
Cc: bioconductor at r-project.org; Jack zhu
Subject: Re: [BioC] GEOquery and Sample Subsets

On Tue, Jun 4, 2013 at 1:14 PM, Thomas H. Hampton
<Thomas.H.Hampton at dartmouth.edu> wrote:
> Exactly!

This might help:

http://www.bioconductor.org/packages/release/bioc/html/GEOmetadb.html

Let us know if you have questions.

Sean
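
For readers of the archive, a minimal sketch of how GEOmetadb might be used for the question in this thread (sample subsets for a GDS without the expression values, plus a look at the schema). The table name `gds_subset` and its `gds` column are assumptions here and should be confirmed with the two schema calls before relying on the query; the file name is the package default.

```r
# Sketch only: assumes the GEOmetadb and RSQLite packages are installed.
library(GEOmetadb)
library(RSQLite)

# One-time download of the metadata database (a large file).
if (!file.exists("GEOmetadb.sqlite")) getSQLiteFile()

con <- dbConnect(SQLite(), "GEOmetadb.sqlite")

# Inspect the schema: list the tables, then the fields of one table.
dbListTables(con)
dbListFields(con, "gds_subset")   # assumed table name; check dbListTables()

# Sample subsets for one GDS, with no expression values downloaded.
# The 'gds' column name is an assumption; confirm it with dbListFields().
subsets <- dbGetQuery(con, "SELECT * FROM gds_subset WHERE gds = 'GDS505'")
str(subsets)

dbDisconnect(con)
```

Because the database is local once downloaded, the same query can be repeated for many accessions without fetching each full GDS object.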


> Thanks.
>
> ________________________________________
> From: seandavi at gmail.com [seandavi at gmail.com] on behalf of Sean Davis [sdavis2 at mail.nih.gov]
> Sent: Tuesday, June 04, 2013 12:54 PM
> To: Thomas H. Hampton
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] GEOquery and Sample Subsets
>
> On Tue, Jun 4, 2013 at 12:38 PM, Thomas H. Hampton
> <Thomas.H.Hampton at dartmouth.edu> wrote:
>> I am using GEOquery to establish sample subsets of GEO data -- that is, I would
>> like to know which samples are replicates.
>>
>> I am doing something like this:
>>
>> gds505 <- getGEO("GDS505")
>> Columns(gds505)
>>
>>> str(Columns(gds505))
>> 'data.frame': 17 obs. of  4 variables:
>>  $ sample       : Factor w/ 17 levels "GSM11805","GSM11814",..: 2 4 5 7 9 10 12 14 16 1 ...
>>  $ disease.state: Factor w/ 2 levels "normal","RCC": 2 2 2 2 2 2 2 2 2 1 ...
>>  $ individual   : Factor w/ 10 levels "001","005","011",..: 6 4 1 2 3 5 8 9 10 6 ...
>>  $ description  : chr  "Value for GSM11814: C035 Renal Clear Cell Carcinoma U133A; src: Trizol...
>>
>> The problem I have is that the getGEO command retrieves a rather large object:
>>
>>> print(object.size(gds505), units="Mb")
>> 12.6 Mb
>>
>> This takes up a lot of time and bandwidth if you plan to do it for thousands of accessions.
>>
>> Is there a way to retrieve less?
>
> Hi, Tom.  Are you saying that you really want just the metadata to
> start; in other words, you just want the sample information without
> the expression values?
>
> Sean
>
>
>> I am happy to use R, BioConductor, bioperl or whatever.
>>
>> Best,
>>
>> Tom
>>
>>
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

