[BioC] biomaRt package: problem with useDataset() function

Hans-Rudolf Hotz hrh at fmi.ch
Fri Jun 24 18:41:53 CEST 2011


Hi Avril

"EB 1 m_ulcerans" looks like the 'version' of the dataset to me, not its
name. useDataset() expects the value from the 'dataset' column.

check:

 > listDatasets(ensemblbacteria)[82,]
           dataset                                    description
82 myc_25870_gene Mycobacterium ulcerans genes (EB 1 m_ulcerans)
            version
82 EB 1 m_ulcerans
 >

Hence, if I do:

 > ensemblMulcerans <- useDataset("myc_25870_gene",mart=ensemblbacteria)
 >
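
If you would rather not copy the name by hand, here is a minimal sketch of picking it out of the 'dataset' column by matching on the description. It runs on a toy copy of the table above, so it needs no connection to the mart. find_dataset_name() is a hypothetical helper, not a biomaRt function, and the second row of the toy table is made up.

```r
# Toy copy of row 82 from the listDatasets() output above, plus one
# made-up row ("other_gene") so the lookup has something to skip.
datasets <- data.frame(
  dataset     = c("myc_25870_gene", "other_gene"),
  description = c("Mycobacterium ulcerans genes (EB 1 m_ulcerans)",
                  "Some other genes (EB 1 other)"),
  version     = c("EB 1 m_ulcerans", "EB 1 other"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of biomaRt): return the value(s) of the
# 'dataset' column for rows whose description contains 'pattern'.
find_dataset_name <- function(datasets, pattern) {
  hits <- grepl(pattern, datasets$description, fixed = TRUE)
  as.character(datasets$dataset[hits])
}

name <- find_dataset_name(datasets, "Mycobacterium ulcerans")
print(name)

# With a live mart the same idea would presumably read:
#   name <- find_dataset_name(listDatasets(ensemblbacteria),
#                             "Mycobacterium ulcerans")
#   ensemblMulcerans <- useDataset(name, mart=ensemblbacteria)
```

The point is only that the lookup goes through the 'dataset' column and never touches 'version'.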

I can do stuff like:

 > getBM(attributes=c("ensembl_transcript_id", "transcript_start"),
 +       filters=c("chromosome_name", "start", "end"),
 +       values=list("pMUM001", "1", "10000"),
 +       ensemblMulcerans)
    ensembl_transcript_id transcript_start
1      EBMYCT00000077424             6612
2      EBMYCT00000077394             1694
3      EBMYCT00000077445             7630
4      EBMYCT00000077427             6383
5      EBMYCT00000077446             8430
6      EBMYCT00000077396             2921
7      EBMYCT00000077437             7188
8      EBMYCT00000077402             5640
9      EBMYCT00000077412                1
10     EBMYCT00000077408             2310
11     EBMYCT00000077390             1117
 >
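
One thing to watch in a query like this: with several filters, getBM() matches the 'values' list to 'filters' by position, so the first value goes with the first filter and so on. A small sketch (plain R, no mart needed) to check the pairing before sending the query. The variable names are just for illustration.

```r
# Same filters and values as in the getBM() call above.
filters <- c("chromosome_name", "start", "end")
values  <- list("pMUM001", "1", "10000")

# One value (or vector of values) is needed per filter.
stopifnot(length(filters) == length(values))

# Label each value with the filter it will be applied to, then inspect.
query <- setNames(values, filters)
str(query)
```

If the labels printed by str() do not line up with what you meant, reorder 'values' before calling getBM().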


I hope this helps
Regards, Hans

 > sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] biomaRt_2.8.0

loaded via a namespace (and not attached):
[1] RCurl_1.5-0 XML_3.2-0
 >




On 06/24/2011 06:05 PM, Coghlan, Avril wrote:
> Dear all,
>
> I am trying to use the biomaRt package to retrieve data from the Ensembl
> database.
> However, I get an error message using the useDataset() function.
> I think this is perhaps because useDataset() does not expect Ensembl
> datasets to have spaces in their names, but some do.
>
> I have typed the following to try to select the Mycobacterium ulcerans
> dataset from the Ensembl Bacteria database:
>     library("biomaRt")
>     ensemblbacteria<- useMart("bacterial_mart_9")
>     listDatasets(ensemblbacteria)
> This tells me that the name of the Mycobacterium ulcerans data set is
> "EB 1 m_ulcerans".
>
> However, when I try to select it using useDataset() I get an error
> message:
>     ensemblMulcerans<- useDataset("EB 1 m_ulcerans",mart=ensemblbacteria)
>
> Error in useDataset("EB 1 m_ulcerans", mart = ensemblbacteria) :
>    The given dataset:  EB 1 m_ulcerans , is not valid.  Correct dataset
> names can be obtained with the listDatasets function.
>
> I think this could be because useDataset does not expect dataset names
> to contain spaces?
>
>
> I would be grateful for any help with this.
>
> Kind Regards,
> Avril
>
> Avril Coghlan
> Cork, Ireland


