[BioC] GEOquery::parseGEO throws error reading file

Gad Abraham gabraham at csse.unimelb.edu.au
Sat Aug 2 05:28:06 CEST 2008


Sean Davis wrote:
> On Sun, Jul 27, 2008 at 11:44 PM, Gad Abraham
> <gabraham at csse.unimelb.edu.au> wrote:
>> Hi,
>>
>> I'm using GEOquery 2.4.1 to read an NCBI GEO file
>> ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE4284/GSE4284_series_matrix.txt.gz
>> but parseGEO throws an error. The switch argument evaluates to "0", which
>> doesn't match any alternative, so it tries to match on the last empty
>> argument and fails. I don't know if this is related to the warnings; the file
>> contains text such as manufacturer\xa1\xafs which may not parse correctly.
>>
>> Below is the output.
>>
>> Thanks for any advice,
>> Gad
>>
>>> g <- getGEO(filename="GSE4284_series_matrix.txt")
> 
> Hi, Gad.  The filename argument does not yet accept GSE series matrix
> files.  I have a couple of changes to make to GSE series matrix
> handling, and adding file-based parsing is one of them.
> 
> Sean

Hi Sean,

Is this also the reason for scan() failing when using the GEO name 
instead of the filename? (See below)

Thanks,
Gad

 > g <- getGEO("GSE4284")
Found 1 file(s)
GSE4284_series_matrix.txt.gz
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE4284/GSE4284_series_matrix.txt.gz'
ftp data connection made, file length 4137302 bytes
opened URL
==================================================
downloaded 3.9 Mb

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
   scan() expected 'a real', got '"Schizosaccharomycespombe"'
In addition: Warning message:
In grep("^!Sample_", a, ignore.case = TRUE) :
   input string 1 is invalid in this locale
 > sessionInfo()
R version 2.7.1 (2008-06-23)
x86_64-pc-linux-gnu

locale:
LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] GEOquery_2.4.1 RCurl_0.9-4    Biobase_2.0.1
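
The grep() warning above can be reproduced in isolation. The following is
a minimal sketch, not GEOquery's own code: the string `a` is a made-up
stand-in for a header line containing the \xa1\xaf bytes quoted earlier,
and it assumes a UTF-8 locale like the one reported by sessionInfo().
Whether the same bytes are also behind the scan() error is not
established here.

```r
# Made-up header line; only the \xa1\xaf bytes come from the report above.
a <- "!Sample_description\tmanufacturer\xa1\xafs protocol"

# The bytes do not form valid UTF-8: iconv() returns NA for such input.
is_invalid <- is.na(iconv(a, from = "UTF-8", to = "UTF-8"))

# Option 1: match byte-wise, so the locale's encoding is never consulted.
hits <- grep("^!Sample_", a, ignore.case = TRUE, useBytes = TRUE)

# Option 2: replace each offending byte with a visible <xx> escape,
# which leaves a string that is valid in any locale.
cleaned <- iconv(a, from = "UTF-8", to = "UTF-8", sub = "byte")

print(is_invalid)
print(hits)
print(cleaned)
```

useBytes = TRUE is reasonable here because the pattern is plain ASCII; for
patterns that contain non-ASCII characters, re-encoding the input with
iconv() is likely the more robust route.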


-- 
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham