[BioC] getGEO invalid multibyte string

Martin Morgan mtmorgan at fhcrc.org
Fri Aug 31 17:48:31 CEST 2007


Hi Tobias --

Very clever fix! I think the problem would also be solved by setting
your locale to a non-UTF-8 one, e.g.,

> Sys.setlocale(locale="C")

(you can revert to previous settings with locale="en_US.UTF-8")
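
If you would rather not hard-code the locale to revert to, a minimal
sketch along these lines saves the current setting and restores it
afterwards. The helper name with_c_locale is hypothetical (it is not part
of R or GEOquery), and it changes only the LC_CTYPE category, on the
assumption that the character-type locale is what triggers the multibyte
error; Sys.setlocale(locale="C") as above changes all categories.

```r
# Hypothetical helper: evaluate an expression under the "C" character-type
# locale, then restore whatever locale was active before.
with_c_locale <- function(expr) {
  old <- Sys.getlocale("LC_CTYPE")                     # remember current setting
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)  # restore even on error
  Sys.setlocale("LC_CTYPE", "C")
  force(expr)  # the argument is lazy, so it is evaluated only here
}

# Inside the helper the character-type locale is "C".
with_c_locale(Sys.getlocale("LC_CTYPE"))
```

With GEOquery loaded, the call from the original report would then be
wrapped as gse <- with_c_locale(getGEO('GSE94')).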

I also think there is a bit of a mismatch between how clever R is about
locales and how clever developers like us are about them. For now, I
would recommend setting the locale to "C" unless UTF-8 encoding is
needed.
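
For comparison, the iconv() call in the patch quoted below re-encodes
each line before it is parsed. The sketch here shows the effect on a
single made-up string; the bytes are an illustration only, not data from
GSE94, and treating the input as Latin-2 is the patch's assumption rather
than a documented property of GEO files. The encoding is spelled
"ISO-8859-2" here, which names the same character set as the patch's
"LATIN2". validUTF8() needs R 3.3.0 or later.

```r
# "Po", then byte 0xB3 (the letter l-with-stroke in ISO-8859-2), then "a".
x <- rawToChar(as.raw(c(0x50, 0x6F, 0xB3, 0x61)))

# A lone 0xB3 is not a legal UTF-8 sequence, which is the kind of input
# that leads to "invalid multibyte string" in a UTF-8 locale.
validUTF8(x)

# Re-encode from Latin-2 to UTF-8, as the patch does for every line read.
y <- iconv(x, from = "ISO-8859-2", to = "UTF-8")
validUTF8(y)

# The single byte 0xB3 has become the two-byte UTF-8 sequence 0xC5 0x82.
charToRaw(y)
```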

Martin

Tobias Straub <tstraub at med.uni-muenchen.de> writes:

> ok, I figured it out myself! modified GEOquery
>
> *** GEOquery    Fri Aug 10 08:24:53 2007
> --- GEOquery_new     Fri Aug 31 14:43:15 2007
> ***************
> *** 495,500 ****
> --- 495,501 ----
>      nextEntity <- ""
>      while(!finished) {
>        line <- readLines(con,1)
> +     line <- iconv(line, "LATIN2", "UTF-8")
>        if(length(line)==0) finished <- TRUE
>        a[lines] <- line
>        lines <- lines+1
> ***************
> *** 510,515 ****
> --- 511,517 ----
>      finished <- FALSE
>      while(!finished) {
>        line <- readLines(con,1)
> +     line <- iconv(line, "LATIN2", "UTF-8")
>        if(length(line)==0) {
>          finished <- TRUE
>        } else {
>
>
> On Aug 31, 2007, at 2:17 PM, Tobias Straub wrote:
>
>> I tried to fetch and parse the GSE94 series from GEO using GEOquery
>> library (gse<-getGEO('GSE94')). Operation is aborted with the message:
>>
>> Error in make.names(as.character(names), allow_) :
>> 	invalid multibyte string 29
>> In addition: There were 12 warnings (use warnings() to see them)
>>
>> is that an error of getGEO or a problem of the data set?
>>
>> Tobias
>>
>>> sessionInfo()
>> R version 2.5.1 (2007-06-27)
>> i386-apple-darwin8.9.1
>>
>> locale:
>> en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"
>> "methods"   "base"
>>
>> other attached packages:
>> GEOquery
>> "2.0.6"
>>
>>
>> ======================================================================
>> Dr. Tobias Straub         Adolf-Butenandt-Institute, Molecular Biology
>> tel: +49-89-2180 75 439         Schillerstr. 44, 80336 Munich, Germany
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org


