[BioC] problems with strand in predictCoding

Jeremiah Degenhardt degenhardt.jeremiah at gene.com
Fri Apr 20 17:58:58 CEST 2012


> There is an "ignore.strand" argument to findOverlaps, so we have a switch. I
> have always thought that strand should be ignored by default in operations
> like overlap detection, and only considered as a "direction" rather than as
> separate in space. It's very useful for resize() and flank() to consider
> strand, but not so useful for findOverlaps. The ignore.strand=FALSE default
> in those cases would qualify for the eighth circle if there were a Bioc
> Inferno book. It's only the default that I argue with though; having the
> capability to consider strand is useful.

I had forgotten about the ignore.strand option, thanks for the
reminder Michael. Given that it's there, I agree with you fully. It
seems the default should be changed to TRUE for the overlap functions,
and for precede() and follow() as well.
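
To make the effect of the switch concrete, here is a minimal sketch in
plain Python (not Bioconductor code). The Range type and the overlaps()
helper are hypothetical names made up for this illustration; the only
behaviour assumed from GenomicRanges is that "*" is compatible with
either strand.

```python
from typing import NamedTuple


class Range(NamedTuple):
    """Hypothetical stand-in for one genomic range (1-based, inclusive)."""
    start: int
    end: int
    strand: str  # "+", "-" or "*"


def strands_compatible(a: str, b: str) -> bool:
    # "*" is treated as matching either strand.
    return a == "*" or b == "*" or a == b


def overlaps(query: Range, subject: Range, ignore_strand: bool = False) -> bool:
    # Positions must intersect regardless of strand handling.
    if query.start > subject.end or subject.start > query.end:
        return False
    # With ignore_strand=True, strand plays no part in the decision.
    return ignore_strand or strands_compatible(query.strand, subject.strand)


# A variant labelled "+" that sits inside a negative-strand gene.
variant = Range(150, 150, "+")
gene = Range(100, 200, "-")

print(overlaps(variant, gene))                      # strand-aware
print(overlaps(variant, gene, ignore_strand=True))  # strand ignored
```

With strand considered, the "+" variant is not reported as overlapping
the "-" gene even though it lies inside it; with strand ignored, it is.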

Note however, that this would not fully correct the issue in the
predictCoding function as the function still needs to correctly
reverse complement the varAllele to get the annotation correct.
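
As a rough sketch of the step I mean (plain Python, not the actual
predictCoding code): the helper names below are hypothetical, and I am
assuming the variant allele is reported on the positive strand, as it
would be in a VCF.

```python
# Translation table for complementing DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def allele_on_transcript_strand(var_allele: str, gene_strand: str) -> str:
    """Return the allele as read along the transcript.

    Assumes var_allele is given on the positive strand. For a
    negative-strand gene it must be reverse complemented before it is
    substituted into the coding sequence.
    """
    if gene_strand == "-":
        return reverse_complement(var_allele)
    return var_allele


print(allele_on_transcript_strand("AC", "+"))  # unchanged on a + gene
print(allele_on_transcript_strand("AC", "-"))  # reverse complemented on a - gene
```

Skipping this step for a negative-strand gene puts the wrong base into
the codon, which is how the predicted amino acid change ends up wrong.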

As a further note on how big an issue this is: if you go to the
BioC home page and look at the tutorial on "Using Bioconductor to
annotate genetic variants", you will find that the example makes this
exact mistake. The variants in the VCF are unstranded, and of the genes
in the example two are on the negative strand and one is on the
positive strand. Following the code, you will get incorrect annotations
for all variants in the negative-strand genes.

Jeremiah



-- 
Jeremiah Degenhardt, Ph.D.
Computational Biologist
Bioinformatics and Computational Biology
Genentech, Inc.
degenhardt.jeremiah at gene.com


