[BioC] rtracklayer gff import

Hans-Rudolf Hotz hrh at fmi.ch
Fri Apr 15 09:29:49 CEST 2011






On 04/15/2011 12:14 AM, Cook, Malcolm wrote:
>
>> rtracklayer currently considers GFF3 files to be right-open.
>> The GFF3 spec states that start is always <= end, and that
>> zero-width intervals have start == end.
>
> yes but 1 width intervals also have start = end
>
>> To me, this suggests that they are right-open. Otherwise, you
>> need some other way to distinguish zero vs. one width intervals,
>> which is crazy.
>
> yes - it is crazy

It might be 'crazy', but it has always been like this:

GFF (and its extensions such as GTF and GFF3) is "end inclusive" (or
right-closed); see:

http://www.sanger.ac.uk/resources/software/gff/spec.html
http://genome.ucsc.edu/FAQ/FAQformat.html#format3
http://genome.ucsc.edu/FAQ/FAQformat.html#format4

and

http://www.sequenceontology.org/gff3.shtml


and the latest GFF3 definition explains very well how to treat
zero-length features:

    "For zero-length features, such as insertion sites, start equals end
     and the implied site is to the right of the indicated base in the
     direction of the landmark."
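
To make the quoted rule concrete, here is a minimal sketch (in Python, not
rtracklayer code) of how 1-based, end-inclusive GFF coordinates could be
mapped to 0-based, half-open ones. The function names and the zero_length
flag are my own assumptions for illustration; the coordinates alone cannot
tell you whether a feature is zero-length, so the caller has to supply that.

```python
def gff_to_half_open(start, end, zero_length=False):
    """Map 1-based, end-inclusive GFF coordinates to a 0-based,
    half-open (begin, stop) pair.

    zero_length is an assumed flag supplied by the caller: start == end
    is ambiguous between a one-base feature and a zero-length site.
    """
    if start > end:
        raise ValueError("GFF requires start <= end")
    if zero_length:
        if start != end:
            raise ValueError("zero-length features must have start == end")
        # The site lies to the right of base `start`, i.e. at the boundary
        # between 1-based bases `start` and `start + 1`.
        return (start, start)
    # Ordinary feature: closed [start, end] becomes half-open [start - 1, end).
    return (start - 1, end)


def feature_width(start, end, zero_length=False):
    """Width in bases under the convention above."""
    begin, stop = gff_to_half_open(start, end, zero_length)
    return stop - begin
```

With this convention, a feature at 100..100 has width 1 unless it is
declared zero-length, in which case it has width 0.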

Yes, as a consequence, you have to pay attention to the value of the
third column (the feature type) to figure out whether a line could be a
zero-length feature. But in practice, this has always been obvious to me.
Also, I hardly ever work with GFF/GTF/GFF3 files that contain different
kinds of features; I usually split by the third column and then treat
each feature according to its meaning.
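
As a rough sketch of that "split by the third column" approach (Python,
standard library only): the set ZERO_LENGTH_TYPES and the type name
"insertion_site" are assumptions for illustration, chosen because the spec
gives insertion sites as its example. Which types count as zero-length in
your own files is something you have to decide.

```python
from collections import defaultdict

# Assumed, illustrative set of feature types to treat as zero-length.
ZERO_LENGTH_TYPES = {"insertion_site"}


def split_by_type(lines):
    """Group GFF lines by feature type (third column) and attach a width.

    Returns a dict mapping type -> list of (seqid, start, end, width).
    """
    by_type = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a feature line
        seqid, ftype = fields[0], fields[2]
        start, end = int(fields[3]), int(fields[4])
        if ftype in ZERO_LENGTH_TYPES and start == end:
            width = 0  # site to the right of base `start`
        else:
            width = end - start + 1  # end-inclusive interval
        by_type[ftype].append((seqid, start, end, width))
    return dict(by_type)
```

Called on the lines of a file (for example, split_by_type(open(path)) with
your own path), each type can then be handled separately.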


My two cents....


Regards, Hans





