[BioC] rtracklayer: import.gff seems to be very slow

Michael Dondrup Michael.Dondrup at uni.no
Fri Oct 15 11:40:36 CEST 2010


I am trying to read in a genome annotation from a GFF3 file from NCBI [1].
The file is about 7.5 MB and has ~17000 non-comment lines. While I can read the file
with read.delim in less than a second, trying
bsub = import.gff("~/Downloads/bsubtilis.gff")
is very slow. I would prefer to use a standardized function from the package
that understands various formats, but currently I cannot use it for whole-genome
annotation. Could this be improved, or is the file format incorrect?
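
The comparison can be sketched roughly as follows. This is only a minimal illustration, not the exact commands that were run: the two-line toy GFF3 file, its made-up feature values, and the names gff_cols, gff_df, t_delim and t_import are assumptions introduced here. The rtracklayer call is attempted only if the package is installed.

```r
# Write a tiny GFF3 file with made-up features so the sketch is self-contained;
# the real input would be the downloaded NCBI file.
gff_lines <- c(
  "##gff-version 3",
  "chrToy\texample\tgene\t100\t900\t.\t+\t.\tID=gene0;Name=toyA",
  "chrToy\texample\tCDS\t100\t900\t.\t+\t0\tID=cds0;Parent=gene0"
)
gff_file <- tempfile(fileext = ".gff")
writeLines(gff_lines, gff_file)

# The nine standard GFF3 columns (names chosen here for illustration).
gff_cols <- c("seqid", "source", "type", "start", "end",
              "score", "strand", "phase", "attributes")

# Baseline: plain tab-delimited read that skips '#' comment lines.
t_delim <- system.time(
  gff_df <- read.delim(gff_file, header = FALSE, comment.char = "#",
                       quote = "", col.names = gff_cols,
                       stringsAsFactors = FALSE)
)
print(t_delim)

# The same file through rtracklayer, timed the same way (skipped if absent).
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  t_import <- system.time(bsub <- rtracklayer::import.gff(gff_file))
  print(t_import)
}
```

On a file this small both timings are negligible; the point is only to show how the two reads would be timed side by side on the full annotation.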


[1]: ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Bacillus_subtilis/AL009126.gff

> sessionInfo()
R version 2.11.1 (2010-05-31)

[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rtracklayer_1.8.1 RCurl_1.4-2       bitops_1.0-4.1   

loaded via a namespace (and not attached):
[1] Biobase_2.8.0       Biostrings_2.16.0   BSgenome_1.16.1    
[4] GenomicRanges_1.0.9 IRanges_1.6.6       XML_3.1-0          
