[BioC] Using Biomart other than Ensembl

James W. MacDonald jmacdon at uw.edu
Wed Feb 26 21:29:43 CET 2014


Hi Daniela,

Please keep replies on the list (i.e., use Reply-all).

On 2/26/2014 3:20 PM, Daniela Moré wrote:
> Hi Jim,
>
> Actually, this first step will give me the read counts through 
> summarizeOverlaps before the DE analysis.
>
> More specifically, I'm choosing a gene model to make a TranscriptDb 
> using makeTranscriptDbFromBiomart (page 7). I'm following the attached 
> documentation from the last Bioconductor summer course in Brazil (to 
> which the page number refers).

If you prefer NCBI identifiers, you can use makeTranscriptDbFromUCSC() 
instead.

library(GenomicFeatures)
tx <- makeTranscriptDbFromUCSC("bosTau6", "refGene")

Should do the trick.
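
For the counting step you mention, a minimal sketch could look like the
following. It assumes the package versions in your sessionInfo(), where
summarizeOverlaps() is reached through GenomicRanges and Rsamtools;
later Bioconductor releases moved it to GenomicAlignments and renamed
makeTranscriptDbFromUCSC() to makeTxDbFromUCSC(). The BAM file names are
hypothetical placeholders. The sequence-name check is there because the
names in your BAM headers have to match the UCSC style (chr1, chr2, ...)
used by the annotation, or no reads will be counted.

```r
library(GenomicFeatures)
library(Rsamtools)
# In this release summarizeOverlaps() comes with GenomicRanges/Rsamtools;
# in newer releases load GenomicAlignments instead (assumption: adjust to
# whatever your installation provides).
library(GenomicRanges)

# Build the annotation from the UCSC refGene track for bosTau6 (needs
# network access).
tx <- makeTranscriptDbFromUCSC("bosTau6", "refGene")

# Group exons by gene; these are the features reads get counted against.
exons_by_gene <- exonsBy(tx, by = "gene")

# Hypothetical BAM files; replace with your own alignments.
bam_paths <- c("sample1.bam", "sample2.bam")

if (all(file.exists(bam_paths))) {
  # Compare sequence names in the annotation and in the first BAM header.
  print(head(seqlevels(exons_by_gene)))
  print(head(names(scanBamHeader(bam_paths[1])[[1]]$targets)))

  # Count reads per gene; the result holds one column per BAM file.
  se <- summarizeOverlaps(exons_by_gene, BamFileList(bam_paths),
                          mode = "Union")
  print(head(assays(se)$counts))
}
```

The count matrix from assays(se)$counts is what you would then pass on
to DESeq2 for the DE analysis.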


Best,

Jim


>
> Thank you in advance
>
> Daniela
>
>
> On Wed, Feb 26, 2014 at 4:55 PM, James W. MacDonald <jmacdon at uw.edu> wrote:
>
>     Hi Daniela,
>
>     On 2/26/2014 2:29 PM, Daniela Moré [guest] wrote:
>
>         Hi guys,
>         I'm new to R and Bioconductor packages, so my question may
>         sound a little basic, but I really could not figure out how
>         to use a database from NCBI in biomaRt.
>         I'm working on RNA-Seq reads to perform DE analysis and I'm
>         interested in the Bos taurus database from NCBI, version UMD3.1.
>
>
>     I think you will need to give more information here. What exactly
>     are you trying to do? Have you already done the DE analysis, and
>     now are simply trying to annotate the results? If so, what type of
>     gene/transcript IDs do you have?
>
>     Best,
>
>     Jim
>
>
>
>         So my question is: how do I choose the bovine UMD3.1 from NCBI
>         in biomaRt? Or would the best solution be to perform the
>         alignment using the Ensembl version?
>
>         Just to be clear, I can't find any NCBI databases when I type:
>
>             library("biomaRt")
>             listMarts()
>
>         If I take a look at "ensembl" [ensembl=useMart("ensembl")],
>         I can see the btaurus_gene_ensembl dataset. However, as I
>         aligned my reads against an NCBI version, when I tried to count
>         the reads it did not work (because they have different
>         identifiers, I guess). The manual shows a short example using a
>         wormDb, but it did not help much.
>
>         -- output of sessionInfo():
>
>         R version 3.0.2 (2013-09-25)
>         Platform: x86_64-redhat-linux-gnu (64-bit)
>
>         locale:
>         [1] C
>
>         attached base packages:
>         [1] parallel stats graphics grDevices utils datasets methods base
>
>         other attached packages:
>         [1] DESeq2_1.2.10 RcppArmadillo_0.4.000.2 Rcpp_0.11.0
>         Rsamtools_1.14.3 Biostrings_2.30.1 GenomicRanges_1.14.4
>         [7] XVector_0.2.0 IRanges_1.20.6 BiocGenerics_0.8.0
>
>         loaded via a namespace (and not attached):
>         [1] AnnotationDbi_1.24.0 BSgenome_1.30.0 Biobase_2.22.0
>         DBI_0.2-7 GenomicFeatures_1.14.2 RColorBrewer_1.0-5
>         [7] RCurl_1.95-4.1 RSQLite_0.11.4 XML_3.98-1.1 annotate_1.40.0
>         biomaRt_2.18.0 bitops_1.0-6
>         [13] genefilter_1.44.0 grid_3.0.2 lattice_0.20-24
>         locfit_1.5-9.1 rtracklayer_1.22.3 splines_3.0.2
>         [19] stats4_3.0.2 survival_2.37-7 tools_3.0.2 xtable_1.7-1
>         zlibbioc_1.8.0
>
>         --
>         Sent via the guest posting facility at bioconductor.org.
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099