[BioC] GDS2eSet problems

Martin Morgan mtmorgan at fhcrc.org
Thu Aug 9 01:53:38 CEST 2007


Wyatt --

Not that I have a solution, but if I

> debug(GDS2eSet)
> GDS2eSet(gds961)

and then step through the function (pressing the 'n' key) until just
before the line "new('ExpressionSet', ...)", and then take a look at
the row names of the expression data and of the feature data, I find

Browse[1]> (bad <- which(rownames(expr) != rownames(pData(featuredata))))
 [1] 12626 12627 12628 12629 12630 12631 12632 12633 12634 12635 12636 12637
[13] 12638 12639 12640 12641 12642 12643 12644 12645 12646 12647 12648
Browse[1]> rownames(expr)[bad]
 [1] "788_s_at"  "119_at"    "1215_at"   "1216_at"   "124_i_at"  "125_r_at" 
 [7] "127_at"    "1301_s_at" "1302_s_at" "132_at"    "1429_at"   "1502_s_at"
[13] "1829_at"   "1864_at"   "1889_s_at" "1982_s_at" "36969_at"  "383_at"   
[19] "397_at"    "412_s_at"  "426_at"    "439_at"    "787_at"   
Browse[1]> rownames(pData(featuredata))[bad]
 [1] "119_at"    "1215_at"   "1216_at"   "124_i_at"  "125_r_at"  "127_at"   
 [7] "1301_s_at" "1302_s_at" "132_at"    "1429_at"   "1502_s_at" "1829_at"  
[13] "1864_at"   "1889_s_at" "1982_s_at" "36969_at"  "383_at"    "397_at"   
[19] "412_s_at"  "426_at"    "439_at"    "787_at"    "788_s_at" 
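
A minimal sketch of the same kind of check outside the debugger, to tell
"same names, different order" apart from "different names". The two
vectors are small hypothetical stand-ins for rownames(expr) and
rownames(pData(featuredata)), not the real GDS961 data; only base R is
used.

```r
# Hypothetical stand-ins for the two sets of row names.
expr_names <- c("1000_at", "788_s_at", "119_at", "1215_at", "787_at")
fd_names   <- c("1000_at", "119_at", "1215_at", "787_at", "788_s_at")

# Positions where the two vectors disagree (same idea as 'bad' above).
bad <- which(expr_names != fd_names)

# Do both vectors hold the same names, regardless of order?
same_set <- setequal(expr_names, fd_names)

# Are they also in the same order?
same_order <- identical(expr_names, fd_names)
```

If same_set is TRUE and same_order is FALSE, the rows are merely
permuted; if same_set is FALSE, one side has names the other lacks.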

Relative to exprs, the name at position 12626 ("788_s_at") seems to be
excised from featureData and re-inserted at 12648, shifting the 22
names in between up by one position. So the problem is somewhere
before the creation of ExpressionSet -- it could be that GDS2eSet
needs to sort the rows of featureData to match the order of exprs, or
that there is actually an error in the data coming from GEO. The
package maintainer (copied on this email; use packageDescription to
get this information) will likely chime in.
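
As an illustration of the first possibility only, here is a sketch of
reordering feature rows to follow the expression row names with
match(). It uses a hypothetical toy data frame (the 'symbol' column and
its values are invented) and base R only. It is not what GDS2eSet
actually does, and it would not help if the data from GEO were
genuinely wrong.

```r
# Hypothetical row names of the expression matrix.
expr_names <- c("1000_at", "788_s_at", "119_at", "1215_at", "787_at")

# Hypothetical feature annotation, with one row out of place.
fd <- data.frame(
    symbol = c("geneA", "geneB", "geneC", "geneD", "geneE"),
    row.names = c("1000_at", "119_at", "1215_at", "787_at", "788_s_at"),
    stringsAsFactors = FALSE
)

# For each expression row name, find its row in the feature table.
idx <- match(expr_names, rownames(fd))

# Refuse to continue if any expression row has no annotation row.
stopifnot(!any(is.na(idx)))

# Reorder the feature rows; drop = FALSE keeps a data frame.
fd_sorted <- fd[idx, , drop = FALSE]

# The row names now agree in both content and order.
names_agree <- identical(rownames(fd_sorted), expr_names)
```

The stopifnot() guard matters: match() returns NA for names it cannot
find, and indexing with NA would silently produce rows of missing
values rather than an error.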

I see this both in R-2.5.0, like you, and in R-devel, as follows

> sessionInfo()
R version 2.6.0 Under development (unstable) (2007-08-08 r42441) 
x86_64-unknown-linux-gnu 

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] GEOquery_2.1.7  RCurl_0.8-1     R.utils_0.9.5   R.oo_1.2.7     
[5] Biobase_1.15.23

Martin 

"Mcmahon, Wyatt" <wyatt.mcmahon at ttu.edu> writes:

> Hello group,
>  
> I know this issue has arisen before, but it appears that there was no final answer to the problem (at least as of the latest incident, dating from June 2007). 
>  
> Similar to others, I am having a difficult time with GDS2eSet.  After downloading GDS961 using 
>  
>>gds961<-getGEO("GDS961")
>  
> I then tried to convert it to an ExpressionSet using GDS2eSet, and got the following output:
>  
>> eset961<-GDS2eSet(gds961)
> trying URL 'http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?targ=self&acc=GPL91&form=text&view=full'
> Content type 'geo/text' length unknown
> opened URL
> downloaded 13370Kb
> File stored at: 
> ...
> Error in validObject(.Object) : invalid class "ExpressionSet" object: featureNames differ between assayData and featureData
>
> Here is my session info:
>> sessionInfo()
> R version 2.5.1 (2007-06-27) 
> i386-pc-mingw32 
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> attached base packages:
> [1] "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"     
> other attached packages:
>     affy   affyio GEOquery  Biobase    limma 
> "1.14.2"  "1.4.1"  "2.0.5" "1.14.1" "2.10.5" 
>
> Thanks in advance,
>  
> Wyatt
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org


