[BioC] Putative bug in Biostrings

Watal M. Iwasaki heavy.watal at gmail.com
Thu Jul 17 07:21:41 CEST 2014


Thanks Hervé,

Using pairwiseAlignment() instead of PairwiseAlignments() solved the
problem. Sorry that I didn't read the documentation thoroughly and
bothered you. Aside from refactoring in those functions and the
documentation, it could be reasonable to exclude such low-level (and
confusable) utility from NAMESPACE. Anyway, thank you again for your
help.

Best,
Watal

On Wed, Jul 16, 2014 at 11:17 AM, Hervé Pagès <hpages at fhcrc.org> wrote:
> Hi Watal,
>
> Hope you don't mind that I'm cc'ing the Bioconductor mailing list.
>
>
> On 07/10/2014 10:50 PM, Watal M. Iwasaki wrote:
>>
>> Dear Mr. Pages,
>>
>> I encountered a strange behavior in Biostrings::mismatchTable(). It
>> seems to throw an error when the PairwiseAlignments argument contains
>> the sequences that differ in the last two consecutive nucleotides. For
>> example,
>>
>>> mismatchTable(PairwiseAlignments('ATGC', 'ATAT'))
>>
>> Error in fromXStringViewsToStringSet(x, out.of.limits = out.of.limits,  :
>>    'x' has "out of limits" views
>
>
> You are calling the PairwiseAlignments() constructor function but I
> suspect that what you really want is the pairwiseAlignment() function
> to perform the alignment. My understanding is that the
> PairwiseAlignments() constructor function is a low-level utility
> that is almost never intended to be used directly. Always use
> pairwiseAlignment():
>
>   > pa <- pairwiseAlignment('ATGC', 'ATAT')
>
>   > writePairwiseAlignments(pa)
>   ########################################
>   # Program: Biostrings (version 2.33.13), a Bioconductor package
>   # Rundate: Tue Jul 15 19:09:10 2014
>   ########################################
>   #=======================================
>   #
>   # Aligned_sequences: 2
>   # 1: P1
>   # 2: S1
>   # Matrix: NA
>   # Gap_penalty: 14.0
>   # Extend_penalty: 4.0
>   #
>   # Length: 4
>   # Identity:       2/4 (50.0%)
>   # Similarity:    NA/4 (NA%)
>   # Gaps:           0/4 (0.0%)
>   # Score: -7.835059
>   #
>   #
>   #=======================================
>
>   P1                 1 ATGC      4
>                        ||
>   S1                 1 ATAT      4
>
>
>   #---------------------------------------
>   #---------------------------------------
>
>   > mismatchTable(pa)
>     PatternId PatternStart PatternEnd PatternSubstring PatternQuality
>   1         1            3          3                G              7
>   2         1            4          4                C              7
>     SubjectStart SubjectEnd SubjectSubstring SubjectQuality
>   1            3          3                A              7
>   2            4          4                T              7
>
> The documentation could certainly be a little bit clearer about this.
>
> One could fairly argue that the PairwiseAlignments() constructor
> should not return invalid PairwiseAlignments objects (which is the
> reason why mismatchTable() then choke on it) but this is only one
> of the many things that would need some refactoring in the
> pairwiseAlignment/PairwiseAlignments code (this code is big and
> complex) and unfortunately I can't dedicate time to this at the
> moment.
>
> Hope this helps though.
>
> Cheers,
> H.
>
>>
>> Thanks,
>> Watal
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319



-- 
Watal M. Iwasaki / 岩嵜 航

Graduate University for Advanced Studies,
Hayama, Kanagawa 240-0193, Japan
+81-46-858-1576
heavy.watal at gmail.com
http://meme.biology.tohoku.ac.jp/students/iwasaki/



More information about the Bioconductor mailing list