[BioC] getGEO function to load files from other locations than GEO ?

Wolfgang Raffelsberger wraff at titus.u-strasbg.fr
Fri Jun 1 20:17:16 CEST 2007


Hi Sean,

as you suggested, here is the output from readLines():
 > readLines('GSM180487.txt',n=10)
 [1] "TYPE\ttext\ttext\ttext\ttext\tinteger\tfloat\tfloat\ttext\ttext\ttext\tinteger\
 [2] "FEPARAMS\tProtocol_Name\tProtocol_date\tScan_Date\tScan_ScannerName\tScan_NumCh
 [3] "DATA\t44k_CGH_0605 (Editable)\t30-Jan-2006 18:01\t06-09-2006 13:25:24\tAgilent
 [4] "*"
 [5] "TYPE\tfloat\tfloat\tfloat\tinteger\tfloat\tfloat\tfloat\tinteger\tfloat\tfloat\
 [6] "STATS\tgDarkOffsetAverage\tgDarkOffsetMedian\tgDarkOffsetStdDev\tgDarkOffsetNum
 [7] "DATA\t38.965\t39\t6.13591\t1000\t38.884\t39\t7.85039\t1000\t1.00937\t1.0098\t3\
 [8] "*"
 [9] "TYPE\tinteger\tinteger\tinteger\ttext\tinteger\ttext\tinteger\tinteger\ttext\tt
[10] "FEATURES\tFeatureNum\tRow\tCol\taccessions\tSubTypeMask\tSubTypeName\tProbeUID\

Most lines in the output above are very long (experiment meta-data), so
I truncated them, since I believe you mainly want to see what kind of
output I get.
Indeed, it doesn't look at all like the output you described.
Does this mean that I get a different kind of format when I first
download the file from GEO?
Surprisingly, accessing GEO directly (without downloading first and then
trying to read the local copy) works without any difficulty...
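
A quick way to tell the two kinds of file apart can be sketched as
follows. This is only a minimal sketch, assuming that a SOFT file
starts with an entity line beginning with "^" (as in the
"^SAMPLE = GSM180487" line quoted below), whereas the file shown above
starts with "TYPE\t...". The helper name looks_like_soft() is
hypothetical and not part of GEOquery; the two temporary files just
mimic the first lines of each format.

```r
# Hypothetical helper (not part of GEOquery): returns TRUE if the first
# non-empty line of the file starts with "^", as SOFT entity lines do.
looks_like_soft <- function(path, n = 10) {
  first <- readLines(path, n = n, warn = FALSE)
  first <- first[nzchar(trimws(first))]   # drop empty lines
  length(first) > 0 && startsWith(first[1], "^")
}

# Two small temporary files that imitate the headers discussed here.
soft_file <- tempfile(fileext = ".txt")
writeLines(c("^SAMPLE = GSM180487", "!Sample_title = ACC 1"), soft_file)

fe_file <- tempfile(fileext = ".txt")
writeLines(c("TYPE\ttext\ttext", "FEPARAMS\tProtocol_Name"), fe_file)

looks_like_soft(soft_file)
looks_like_soft(fe_file)
```

Such a check could be run before calling getGEO(filename=...) on a
local file, so that a non-SOFT file is noticed before the parser fails.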

In the meantime I've managed to read the data using read.maimages() from
limma, so there's no more urgency to find a solution to this issue.
As I know too little about the various GEO formats, I'm afraid this may
get too complicated... or I came across a bad example (here I'm not
reading CGH data).
The route via getGEO() might have been more elegant/flexible, though...
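
For reference, the limma route can be sketched roughly as below. This
is a sketch under assumptions, not the exact call I used: it assumes
the limma package is installed and that the file is Agilent Feature
Extraction output, which is what the FEPARAMS/STATS/FEATURES header
above suggests. The wrapper name read_agilent_fe() is hypothetical.

```r
# Hypothetical wrapper around limma::read.maimages(); assumes limma is
# installed and that 'path' points to an Agilent Feature Extraction file.
read_agilent_fe <- function(path) {
  if (!requireNamespace("limma", quietly = TRUE)) {
    stop("the limma package is required")
  }
  if (!file.exists(path)) {
    stop("file not found: ", path)
  }
  # source = "agilent" tells limma to expect Agilent column names
  limma::read.maimages(files = path, source = "agilent")
}

# Example call (commented out, since it needs the downloaded file):
# rg <- read_agilent_fe("GSM180487.txt")
```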

Thanks for your help anyway,
Wolfgang


Sean Davis wrote:
> Wolfgang Raffelsberger wrote:
>   
>> Dear list,
>>
>> Sorry to bug you again on the issue of using the "getGEO()" function
>> to load files from other locations than GEO...
>>
>> Sean Davis wrote:
>>     
>>> See the help for getGEO.  There is a filename argument that does
>>> exactly what you are describing. 
>>> Sean
>>>   
>>>       
>> I tried to specify the file using the filename argument:
>>
>>     
>> > in.file <- "GSM180487.txt"   # just picking an example of an original GEO downloaded & decompressed file
>> > deGEO1 <- getGEO(filename=in.file)   # i.e., from the directory with my file...
>> Error in switch(as.character(first.entity[1]), sample = { :
>>        argument is missing, with no default
>>     
> Wolfgang,
>
> It looks like GSM180487 might not be a SOFT format file.  If you run
> this command:
>
> readLines('GSM180487.txt',n=10)
>
> you should get this:
>
>  [1] "^SAMPLE = GSM180487"
>  [2] "!Sample_title = ACC 1"
>  [3] "!Sample_geo_accession = GSM180487"
>  [4] "!Sample_status = Public on Apr 10 2007"
>  [5] "!Sample_submission_date = Apr 04 2007"
>  [6] "!Sample_last_update_date = Apr 09 2007"
>  [7] "!Sample_type = genomic"
>  [8] "!Sample_channel_count = 2"
>  [9] "!Sample_source_name_ch1 = ACC Tumor Sample 1"
> [10] "!Sample_organism_ch1 = Homo sapiens"
>
> If not, then you don't have a SOFT format file, most likely.  Let me
> know if you need more direction.
>
> Sean
>
>
>
>   


-- 

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
IGBMC
1 rue Laurent Fries,  67404 Illkirch  Strasbourg,  France
Tel (+33) 388 65 3300         Fax (+33) 388 65 3276
http://www-bio3d-igbmc.u-strasbg.fr/~wraff
wolfgang.raffelsberger at igbmc.u-strasbg.fr
