[BioC] obtaining a complete global alignment via pairwiseAlignment

Martin Morgan mtmorgan at fhcrc.org
Sat Jun 15 19:19:10 CEST 2013


On 06/15/2013 09:18 AM, Robert K. Bradley wrote:
> Thanks Martin.  as.character (aligned (...)) does return a gapped string for one
> of the two aligned sequences, but even that command trims off unaligned
> characters.  For example:
> For example:
>  > s1 = DNAString ("AGGGG")
>  > s2 = DNAString ("GGGGA")
>  > xx = pairwiseAlignment (s1, s2, gapOpening = -1)
>  > as.character (xx)
> [1] "GGGG-"
> while the whole gapped string is "AGGGG-".
>
> I apologize in advance if I'm missing something obvious here.

Not really doing much better on my end; I'll offer this up then step aside until 
someone with more expertise chimes in. Maybe this is closer to what you're 
after, padding aligned(xx) with and left or right overhang of pattern

     lraligned <- function(pattern, subject, ...) {
         xx <- pairwiseAlignment (pattern, subject, gapOpening = -1)
         aln <- aligned(xx)
         lpad <- substr(pattern, 1L, start(pattern(xx)) - 1L)
         rpad <- substr(pattern, end(pattern(xx)) + 1L, nchar(pattern))
         paste0(lpad, aln, rpad)
     }

 > lraligned(DNAString ("AGGGG"), DNAString ("GGGGA"), type="global")
[1] "AGGGG-"
 > lraligned(DNAString ("GGGGA"), DNAString ("AGGGG"), type="global")
[1] "-GGGGA"
 > lraligned(DNAString("AAGGAA"), DNAString("GG"))
[1] "AAGGAA"
 > lraligned(DNAString("GG"), DNAString("AAGGAA"))
[1] "--GG--"

Martin

>
> Best,
> Rob
>
> On 6/15/13 6:45 AM, Martin Morgan wrote:
>> On 06/14/2013 07:57 PM, Robert K.Bradley [guest] wrote:
>>>
>>> Hello,
>>>
>>> I'm interested in using the pairwiseAlignment function in Biostrings
>>> to perform simple alignments of short sequences.  I noticed that even
>>> the global alignment mode returns an alignment with gapped ends
>>> trimmed off.  For example:
>>>
>>>> s1 = DNAString ("AAGGAA")
>>>> s2 = DNAString ("GG")
>>>> pairwiseAlignment (s1, s2, type = "global")
>>> Global PairwiseAlignmentsSingleSubject (1 of 1)
>>> pattern: [3] GG
>>> subject: [1] GG
>>> score: -32.03649
>>>
>>> Is it possible to obtain the complete alignment including the
>>> (possibly) gapped ends?  In this case, that would be:
>>> --GG--
>>> AAGGAA
>>> For my purposes, having the complete gapped alignment is important.
>>
>> Hi Robert -- Not an expert on pairwiseAlignment, but I think the return
>> value is a class that contains more information than it is showing
>>
>>      > xx = pairwiseAlignment(s2, s1, type="global")
>>      > class(xx)
>>      [1] "PairwiseAlignmentsSingleSubject"
>>      attr(,"package")
>>      [1] "Biostrings"
>>      > class?PairwiseAlignmentsSingleSubject
>>      > aligned(xx)
>>        A DNAStringSet instance of length 1
>>          width seq
>>      [1]     6 --GG--
>>
>> Martin
>>
>>>
>>> My question seems related to a previous post:
>>> https://stat.ethz.ch/pipermail/bioconductor/2012-February/043577.html
>>>
>>> Thank you in advance.
>>>
>>> Best wishes,
>>> Rob
>>>
>>
>>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



More information about the Bioconductor mailing list