[BioC] Error using Homo.sapiens AnnotationDbi package with GenomicFeatures

Marc Carlson mcarlson at fhcrc.org
Tue Nov 13 19:19:09 CET 2012


Hi Chris,

I also noticed that in your earlier select() query, "ENTREZID" was not
coming back properly.  This has now been fixed, so (after a quick
update) you can also do this for the last step:

res2 <- select(Homo.sapiens, keys=k, cols=c("ENTREZID","TXNAME"),
               keytype="TXNAME")
head(res2)
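
If you then want one Entrez ID per transcript, in the same order as k,
a small sketch using match() is below.  The IDs are made-up stand-ins
rather than real select() output, and keeping only the first matching
row per transcript is an assumption that may not suit every analysis.

```r
## Toy stand-ins for k and res2 (hypothetical IDs, not real output)
k <- c("txA", "txB", "txC")
res2 <- data.frame(TXNAME   = c("txB", "txA", "txC"),
                   ENTREZID = c("111", NA, "222"),
                   stringsAsFactors = FALSE)

## match() gives, for each element of k, the first matching row of res2
idx <- match(k, res2$TXNAME)
entrez <- res2$ENTREZID[idx]

## One (possibly NA) Entrez ID per transcript, in the order of k
names(entrez) <- k
entrez
```

The same two lines (match(), then indexing) should carry over to the
real k and res2; transcripts without a gene assignment simply stay NA.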

   Marc


On 11/08/2012 04:44 PM, Marc Carlson wrote:
> Hi Chris,
>
> If you load the Homo.sapiens package, you will see it load the 
> TxDb.Hsapiens.UCSC.hg19.knownGene package for you as a dependency.  So 
> you don't need to call makeTranscriptDbFromUCSC(), at least not for 
> the track you were going for, because that was already loaded via the 
> TxDb.Hsapiens.UCSC.hg19.knownGene package.  To get the promoter 
> regions, you really only need to call promoters like this:
>
> library(Homo.sapiens)
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
> ## check the defaults in case you don't like them!
> proms <- promoters(txdb, upstream=2000, downstream=200)
> proms
>
> ## Once you have the promoters, you can look up the tx_names for
> ## these like this.
> k <- proms$tx_name
>
> ## And then you can use select to retrieve the matching gene IDs.
> ## In the case of Homo.sapiens, the gene IDs actually *are* Entrez
> ## gene IDs (because that is what the knownGene track is using as a
> ## gene ID).
> res <- select(Homo.sapiens, keys=k, cols=c("GENEID","TXNAME"),
>               keytype="TXNAME")
> head(res)
>
>
>
>   Marc
>
>
>
>
> On 11/08/2012 10:54 AM, Chris Whelan wrote:
>> Hi,
>>
>> I'm having trouble using the AnnotationDbi package and was wondering
>> if someone could tell me what I'm doing wrong. I'm trying to use
>> GenomicFeatures to find promoter regions and then use AnnotationDbi to
>> look up the Entrez Gene IDs for those transcripts, but I'm getting an
>> error. If I'm going about this all wrong, let me know; I find it a
>> little difficult to follow the thread of the documentation across the
>> various feature/annotation packages. At the very least, the error
>> message that I'm getting could probably be a little friendlier.
>>
>> Thanks!
>>
>> Chris
>>
>> Bioconductor version 2.11 (BiocInstaller 1.8.3), ?biocLite for help
>>> library(GenomicFeatures)
>> Loading required package: BiocGenerics
>>
>> Attaching package: 'BiocGenerics'
>>
>> The following object(s) are masked from 'package:stats':
>>
>>      xtabs
>>
>> The following object(s) are masked from 'package:base':
>>
>>      Filter, Find, Map, Position, Reduce, anyDuplicated, cbind,
>>      colnames, duplicated, eval, get, intersect, lapply, mapply, mget,
>>      order, paste, pmax, pmax.int, pmin, pmin.int, rbind, rep.int,
>>      rownames, sapply, setdiff, table, tapply, union, unique
>>
>> Loading required package: IRanges
>> Loading required package: GenomicRanges
>> Loading required package: AnnotationDbi
>> Loading required package: Biobase
>> Welcome to Bioconductor
>>
>>      Vignettes contain introductory material; view with
>>      'browseVignettes()'. To cite Bioconductor, see
>>      'citation("Biobase")', and for packages 'citation("pkgname")'.
>>
>>> library(Homo.sapiens)
>> Loading required package: OrganismDbi
>> Loading required package: GO.db
>> Loading required package: DBI
>>
>> Loading required package: org.Hs.eg.db
>>
>> Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
>>> hg19UCSCGenes <- makeTranscriptDbFromUCSC(genome = "hg19", tablename = "knownGene")
>> Download the knownGene table ... OK
>> Download the knownToLocusLink table ... OK
>> Extract the 'transcripts' data frame ... OK
>> Extract the 'splicings' data frame ... OK
>> Download and preprocess the 'chrominfo' data frame ... OK
>> Prepare the 'metadata' data frame ... metadata: OK
>>> k<- elementMetadata(head(promoters(hg19UCSCGenes)))[,"tx_name"]
>> Warning messages:
>> 1: In `start<-`(`*tmp*`, value = c(9874, 9874, 9874, 67091, 319084,  :
>>    trimmed start values to be positive
>> 2: In `end<-`(`*tmp*`, value = c(12073, 12073, 12073, 69290, 321283,  :
>>    trimmed end values to be <= seqlengths
>>> k
>> [1] "uc001aaa.3" "uc010nxq.1" "uc010nxr.1" "uc001aal.1" "uc001aaq.2"
>> [6] "uc001aar.2"
>>> head(keys(Homo.sapiens, keytype="TXNAME"))
>> [1] "uc001aaa.3" "uc010nxq.1" "uc010nxr.1" "uc001aal.1" "uc001aaq.2"
>> [6] "uc001aar.2"
>>> select(Homo.sapiens, keys=k, keytype="TXNAME", cols=c("TXNAME", "ENTREZID")
>> + )
>> Error in if (nrow(res) > 0L) { : argument is of length zero
>>> sessionInfo()
>> R version 2.15.1 (2012-06-22)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>   [1] Homo.sapiens_1.0.0
>>   [2] TxDb.Hsapiens.UCSC.hg19.knownGene_2.8.0
>>   [3] org.Hs.eg.db_2.8.0
>>   [4] GO.db_2.8.0
>>   [5] RSQLite_0.11.2
>>   [6] DBI_0.2-5
>>   [7] OrganismDbi_1.0.0
>>   [8] GenomicFeatures_1.10.0
>>   [9] AnnotationDbi_1.20.2
>> [10] Biobase_2.18.0
>> [11] GenomicRanges_1.10.4
>> [12] IRanges_1.16.4
>> [13] BiocGenerics_0.4.0
>> [14] BiocInstaller_1.8.3
>>
>> loaded via a namespace (and not attached):
>>   [1] BSgenome_1.26.1    Biostrings_2.26.2  RBGL_1.34.0        RCurl_1.95-3
>>   [5] Rsamtools_1.10.1   XML_3.95-0.1       biomaRt_2.14.0     bitops_1.0-4.2
>>   [9] graph_1.36.0       parallel_2.15.1    rtracklayer_1.18.0 stats4_2.15.1
>> [13] tools_2.15.1       zlibbioc_1.4.0
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: 
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>


