[BioC] GRanges setdiff With Features on Both Strands

Cook, Malcolm MEC at stowers.org
Mon Jan 23 16:12:34 CET 2012


> > As for the weird behavior you get with ignore.strand=TRUE, I agree
> > it doesn't make much sense and it's not totally clear to me what the
> > correct behavior should be. Sadly the man page doesn't say anything
> > about what the intended behavior is (there is only a brief note about
> > what ignore.strand=TRUE does for parallel set operations), gives no
> > example, and this case is not covered by the unit tests either.
> >
> > However, by looking at what union() and gaps() do with
> > ignore.strand=TRUE, and also at the internals of the GenomicRanges
> > package, it seems to me that the intention was that:
> >
> >  union(x, y, ignore.strand=TRUE)
> >  intersect(x, y, ignore.strand=TRUE)
> >  setdiff(x, y, ignore.strand=TRUE)
> >
> > would be equivalent to:
> >
> >  union(`strand<-`(x, "+"), `strand<-`(y, "+"))
> >  intersect(`strand<-`(x, "+"), `strand<-`(y, "+"))
> >  setdiff(`strand<-`(x, "+"), `strand<-`(y, "+"))
> >
> > So I could fix setdiff() and intersect() to do this (intersect() has
> > a weird behavior too), and update the documentation. Does that sound
> > reasonable?
> >
> >
> This would be reasonable to me.

I think that retaining `ignore.strand` with the proposed semantics of (potentially) altering the strand is inviting further confusion.

For example, it would introduce the possibility that the union of x with itself is not all.equal to itself only in the case when the strand of x is not '+'.

I think this is a markedly poor outcome since `ignore.strand` is clearly NOT what the operation would be doing.

> > Finally I'd like to mention that I see very little value in supporting
> > the 'ignore.strand' arg in those 3 methods so another option would be
> > to just get rid of it.
> >
> >
> I think there is value here. Often, one wants to ignore the strand in these
> operations (say between reads and transcripts) without destroying the
> strand information. Creating temporary objects is a nuisance.

Perhaps a compromise then.....

Since the implementation appears to have been undocumented and the intention unclear, how about replace `ignore.strand` option with a `with.strand=c('+','-','*')` option which sets the GRanges strand prior to performing the operation.



More information about the Bioconductor mailing list