[BioC] NCBI gff3 annotation file and read.gff()

Chris Stubben stubben at lanl.gov
Wed Jul 16 17:58:04 CEST 2014


I would also suggest using rtracklayer's import() or creating a genome 
data package.   At least for microbial genomes, you often just need to 
return the features (CDS, pseudogenes, tRNAs, etc.) whose parent has a 
locus_tag key, and assign that locus tag to the child (the read.gff 
default), so that is the step getting messed up with your large file.   

In future versions I'll probably use the object returned by rtracklayer 
import instead, and then join the rows where locus_tag is NA to the rows 
where locus_tag is not NA, matching the child's Parent to the parent's ID. 
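
A rough sketch of that Parent-to-ID join, written in Python rather than R 
so it stays independent of any package API. The records are made up, the 
field names simply mirror the GFF3 attributes (ID, Parent, locus_tag), and 
assign_locus_tags is a hypothetical helper, not a function in rtracklayer 
or read.gff:

```python
# Made-up feature records; None plays the role of NA.
features = [
    {"ID": "gene0", "Parent": None, "type": "gene", "locus_tag": "b0001"},
    {"ID": "cds0", "Parent": "gene0", "type": "CDS", "locus_tag": None},
    {"ID": "gene1", "Parent": None, "type": "gene", "locus_tag": "b0002"},
    {"ID": "rna1", "Parent": "gene1", "type": "tRNA", "locus_tag": None},
    {"ID": "cds9", "Parent": "gene9", "type": "CDS", "locus_tag": None},
]


def assign_locus_tags(features):
    """Return child features with the locus_tag copied from their parent."""
    # Lookup table built from the rows where locus_tag is not NA.
    tag_by_id = {
        f["ID"]: f["locus_tag"] for f in features if f["locus_tag"] is not None
    }
    children = []
    for f in features:
        # Join: rows where locus_tag is NA, matched on Parent == parent's ID.
        if f["locus_tag"] is None and f["Parent"] in tag_by_id:
            children.append({**f, "locus_tag": tag_by_id[f["Parent"]]})
    return children


tagged = assign_locus_tags(features)
for f in tagged:
    print(f["ID"], f["type"], f["locus_tag"])
```

Children whose Parent has no locus_tag row (cds9 here) are dropped, which 
is one possible choice; keeping them with a missing tag would be another.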

Chris


>I cc'd the packageMaintainer(), so that they are more likely to see this post.

>I don't know whether this helps in this particular case, but packages should be 
>using rtracklayer::import rather than creating their own readers. Then at least 
>whatever deficiencies are identified and corrected benefit the entire project.




-- 

Chris Stubben

Los Alamos National Lab
Bioscience Division
MS M888
Los Alamos, NM 87545

Phone: (505) 667-3295


