[BioC] bug in Biostrings mismatchTable?

Hervé Pagès hpages at fhcrc.org
Fri Oct 12 09:45:36 CEST 2012


Hi Janet,

Thanks again for the bug report. This one should be fixed in Biostrings
2.26.2 (release) and 2.27.3 (devel).
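
For anyone who wants to double-check the reported positions by hand, here is a minimal base-R sketch (no Biostrings needed). It is only an illustration, not the code used in the fix. It assumes an ungapped comparison that starts at the reported alignment start (pattern position 7, subject position 1) and walks to the end of the shorter sequence; the variable names are made up for this example.

```r
# Base-R sketch; sequences and coordinates copied from the report quoted below.
seq1 <- "GCTGAAGTAGTTCTCCAGAA"   # pattern
seq2 <- "GTAGTTCTCCAAAGT"        # subject

pattern_start <- 7L    # reported start of the aligned range in the pattern
pattern_end   <- 17L   # reported end of the aligned range in the pattern
subject_start <- 1L    # reported start of the aligned range in the subject

# Assumption: no gaps, so both sequences are walked in lockstep from the
# alignment start until the shorter one runs out.
n <- min(nchar(seq1) - pattern_start + 1L, nchar(seq2) - subject_start + 1L)
p <- strsplit(substr(seq1, pattern_start, pattern_start + n - 1L), "")[[1]]
s <- strsplit(substr(seq2, subject_start, subject_start + n - 1L), "")[[1]]

# Mismatching positions, expressed in pattern coordinates.
mismatch_pos <- pattern_start + which(p != s) - 1L

inside  <- mismatch_pos[mismatch_pos <= pattern_end]  # within the alignment
outside <- mismatch_pos[mismatch_pos >  pattern_end]  # past the alignment end

print(inside)
print(outside)
```

Under these assumptions there are no mismatches inside the aligned range 7-17, and the two mismatches past the end sit at pattern positions 18 and 20, which agrees with the report. One would therefore expect mismatchTable() to return no rows for this alignment.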

Cheers,
H.

On 10/10/2012 05:13 PM, Janet Young wrote:
> Hi there,
>
> I think I've found a bug in mismatchTable (Biostrings).  It's reporting a mismatch after the end of the reported alignment.  I think the code below shows the problem.
>
> thanks, as usual!
>
> Janet
>
> #####
>
> library(Biostrings)
>
> ### Two sequences: the middle portion aligns, but the last few bases don't. I'm not interested in those last few bases, so I do a local alignment.
> seq1 <- DNAString("GCTGAAGTAGTTCTCCAGAA")
> seq2 <-       DNAString("GTAGTTCTCCAAAGT")
> aln1 <- pairwiseAlignment ( seq1, seq2, type="local" )
> aln1
> # Local PairwiseAlignmentsSingleSubject (1 of 1)
> # pattern: [7] GTAGTTCTCCA
> # subject: [1] GTAGTTCTCCA
> # score: 21.79932
>
> end(pattern(aln1))
> # [1] 17
>
> mismatchTable(aln1)
> #  PatternId PatternStart PatternEnd PatternSubstring PatternQuality
> #1         1           18         18                G              7
> #  SubjectStart SubjectEnd SubjectSubstring SubjectQuality
> #1           12         12                A              7
> #### The one mismatch that's reported (pattern position 18) lies after the end of the alignment reported above (pattern end 17). There's another mismatch after the end of the alignment that wasn't reported.
>
> sessionInfo()
>
> R Under development (unstable) (2012-10-03 r60868)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] Biostrings_2.27.2  IRanges_1.17.0     BiocGenerics_0.5.0
>
> loaded via a namespace (and not attached):
> [1] parallel_2.16.0 stats4_2.16.0

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


