[BioC] rtracklayer proposal for ISSUE: import.gff3 asRangedData=FALSE fails when strand is '.'

Cook, Malcolm MEC at stowers.org
Wed Apr 18 18:04:58 CEST 2012


Hi, rtracklayerers,

import.gff3 with asRangedData=TRUE accepts a period in the strand column (it shows up as NA in the strand of the imported RangedData); however, calling it with asRangedData=FALSE raises an error:

> gff.str<-"2L\tFlyBase\tgene\t7529\t9484\t0\t.\t0\tID=FBgn0031208;Name=CG11023"
> import.gff3(textConnection(gff.str),asRangedData=TRUE)
RangedData with 1 row and 7 value columns across 1 space
     space       ranges |     type   source    phase   strand          ID        Name     score
  <factor>    <IRanges> | <factor> <factor> <factor> <factor> <character> <character> <numeric>
1       2L [7529, 9484] |     gene  FlyBase        0       NA FBgn0031208     CG11023         0
> import.gff3(textConnection(gff.str),asRangedData=FALSE)
Error in strand(runValue(strand)) : strand values must be in '+' '-' '*'

The GFF3 spec allows '.' (and '?') to appear as value of strand:

Column 7: "strand"
The strand of the feature.  + for positive strand (relative to the
landmark), - for minus strand, and . for features that are not
stranded.  In addition, ? can be used for features whose strandedness
is relevant, but unknown.
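
For illustration, here is a minimal base-R sketch (no rtracklayer involved) that pulls column 7 out of the example record above and checks it against the three values the error message lists as acceptable. The variable names fields and strand.raw are just for this example.

```r
# The same single-record GFF3 line used in the session above
gff.str <- "2L\tFlyBase\tgene\t7529\t9484\t0\t.\t0\tID=FBgn0031208;Name=CG11023"

# Split the record on tabs; GFF3 has nine tab-separated columns
fields <- strsplit(gff.str, "\t", fixed = TRUE)[[1]]

# Column 7 is the strand
strand.raw <- fields[7]

# Is it one of the values named in the error message ('+', '-', '*')?
strand.ok <- strand.raw %in% c("+", "-", "*")
```

Here strand.raw is "." and strand.ok is FALSE, which is the out-of-band case at issue.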

Arguably, import.gff{,2,3} should provide some control over how '.' and '?' in the strand column are interpreted, so that the imported values comport with strand and GRanges.

I propose the following fix, which is intended to be backwards compatible.

New argument to import.gff{,2,3}

 strandMap: control for mapping out-of-band values (FALSE, TRUE, a string, or a list), understood as follows:
	FALSE: the default; do not map out-of-band values
	TRUE: map all out-of-band values to '*'
	any length-1 character vector (a string): map all out-of-band values to it (presumably it will be one of '*', '-', '+')
	a list: look up how to map each out-of-band value in the list by name
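
To make the intended semantics concrete, here is a rough base-R sketch of the mapping step on its own. The helper name mapStrand is hypothetical and is not part of rtracklayer; it operates on a plain character vector of raw strand values and is only meant to illustrate the four cases, not to show how import.gff would actually be wired up. One assumption of mine: in the list case, out-of-band values with no entry in the list are left untouched.

```r
# Hypothetical helper (not an rtracklayer function): apply the proposed
# strandMap semantics to a character vector of raw GFF strand values.
mapStrand <- function(x, strandMap = FALSE) {
  valid <- c("+", "-", "*")
  oob <- !(x %in% valid)  # out-of-band entries, e.g. "." or "?"
  if (is.list(strandMap)) {
    # list: look up each out-of-band value by name;
    # values without an entry are left as they are (my assumption)
    hit <- oob & (x %in% names(strandMap))
    if (any(hit)) {
      x[hit] <- as.character(unlist(strandMap[x[hit]], use.names = FALSE))
    }
  } else if (is.character(strandMap) && length(strandMap) == 1L) {
    # a string: map every out-of-band value to it
    x[oob] <- strandMap
  } else if (isTRUE(strandMap)) {
    # TRUE: map every out-of-band value to "*"
    x[oob] <- "*"
  }
  # FALSE (the default) falls through and returns x unchanged
  x
}

raw <- c("+", ".", "?", "-")
mapStrand(raw)                              # default: nothing is mapped
mapStrand(raw, TRUE)                        # "." and "?" become "*"
mapStrand(raw, "+")                         # "." and "?" become "+"
mapStrand(raw, list("." = "*", "?" = "+"))  # per-value lookup by name
```

With the default of FALSE the vector comes back unchanged, so existing callers would see the current behaviour.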

If it is agreed that this is the best resolution, and the rtracklayer gods wish it, I will take this as my first opportunity to contribute and will follow up accordingly.

Else?

Cheers,

Malcolm


