[BioC] makeTranscriptDbFromGFF fails on NCBI Bacteria genomes

Thomas Girke thomas.girke at ucr.edu
Fri Jun 7 19:52:23 CEST 2013

It seems to me that makeTranscriptDbFromGFF does not yet work on the
bacterial GFF files from NCBI (and perhaps on others too):

## For instance, the following 
download.file("ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Mycoplasma_arthritidis_158L3_1_uid58005/NC_011025.gff", destfile="NC_011025.gff")
txdb <- makeTranscriptDbFromGFF(file="NC_011025.gff", format="gff3", dataSource="NCBI", species="Some bact")

## returns this error:
extracting transcript information
Error in .prepareGFF3TXS(data) :
  No Transcript information present in gff file

I guess this is because the bacterial GFF files contain no explicit
transcript annotations. There are hacks around this problem, but it
would be nice if this could be supported out of the box in the future.
I apologize if I missed an existing solution for this.
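
To illustrate the kind of hack I mean, here is a minimal sketch in base R
that reads the nine standard GFF3 columns, tabulates the feature types in
column 3, and pulls out the rows of one type for downstream use. The helper
names read_gff, feature_types and select_features are hypothetical (they are
not part of GenomicFeatures or rtracklayer), and I am assuming the file is a
plain tab-delimited GFF3 without an embedded FASTA section.

```r
# Hypothetical helpers, base R only; not part of any Bioconductor package.

# Read the nine standard GFF3 columns, skipping comment/directive lines.
read_gff <- function(path) {
  cols <- c("seqid", "source", "type", "start", "end",
            "score", "strand", "phase", "attributes")
  lines <- readLines(path)
  lines <- lines[nzchar(lines) & !grepl("^#", lines)]
  read.delim(text = lines, header = FALSE, sep = "\t", quote = "",
             comment.char = "", col.names = cols,
             stringsAsFactors = FALSE)
}

# Count how many rows there are of each feature type (column 3).
feature_types <- function(gff) table(gff$type)

# Keep only the rows of one feature type, e.g. "gene" or "CDS".
select_features <- function(gff, type = "CDS") {
  gff[gff$type == type,
      c("seqid", "start", "end", "strand", "attributes")]
}

# Usage, assuming the file from the download.file() call above exists:
if (file.exists("NC_011025.gff")) {
  gff <- read_gff("NC_011025.gff")
  print(feature_types(gff))   # shows which feature types the file contains
  cds <- select_features(gff, "CDS")
}
```

If the table shows only gene/CDS-style rows and no transcript-level types,
that would be consistent with the error above. The resulting data frame
could then be turned into ranges by hand, but this bypasses the TranscriptDb
machinery entirely, which is why built-in support would be preferable.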



> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  utils     datasets  grDevices methods   base

other attached packages:
[1] GenomicFeatures_1.12.1 AnnotationDbi_1.22.0   Biobase_2.20.0         rtracklayer_1.20.1     GenomicRanges_1.12.0   IRanges_1.18.0         BiocGenerics_0.6.0

loaded via a namespace (and not attached):
 [1] BSgenome_1.28.0   Biostrings_2.28.0 DBI_0.2-5         RCurl_1.95-4.1    RSQLite_0.11.2    Rsamtools_1.12.0  XML_3.96-1.1      biomaRt_2.16.0    bitops_1.0-5      stats4_3.0.0      tools_3.0.0       zlibbioc_1.6.0
