[BioC] Fwd: the diffrent alignment result between pairwiseAlignment() in Biostrings package and EBI-webserver

Valerie Obenchain vobencha at fhcrc.org
Sat Dec 31 02:29:25 CET 2011


Hi Zongfu,

I'm still looking into this but wanted to give you an update since you 
have been waiting.

I believe these differences in output occur when multiple pairwise 
alignments produce the maximum alignment score. In this case a decision 
must be made about which alignment to report. As explained on the man 
page, pairwiseAlignment() reports the alignment with the smallest 
initial deletion whose mismatches occur before its insertions and deletions.

p <- "AGTA"
s <- "ACCTA"

> pairwiseAlignment(p, s, type="global")
Global PairwiseAlignedFixedSubject (1 of 1)
pattern: [1] AG-TA
subject: [1] ACCTA
score: -13.95402

The 'G' could align to either 'C', and both placements give the same 
score, but the alignment reported has the 'G' before the 
insertion/deletion. This is consistent with the rules stated on the 
pairwiseAlignment() man page. The same situation occurs with the "ND" 
in your example below. You may want to contact EBI support to ask how 
they choose which alignment to report when more than one gives the 
maximum score.
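
To illustrate the tie, here is a minimal base-R sketch that scores a 
gapped alignment by hand. The helper score_alignment() and its scoring 
values (match 1, mismatch -1, and a cost of gap_open + k * gap_extend 
for a gap of length k) are assumptions made for illustration, not the 
scoring pairwiseAlignment() used above, so the absolute scores differ 
from -13.95402. Only the equality between the two placements matters.

```r
# Hypothetical helper: score two equal-length gapped strings.
# Scoring values are illustrative assumptions, not Biostrings defaults.
score_alignment <- function(pattern, subject,
                            match_score = 1, mismatch_score = -1,
                            gap_open = 10, gap_extend = 4) {
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  stopifnot(length(p) == length(s))

  # Cost of all gap runs in one sequence: each run of k consecutive
  # '-' characters costs gap_open + k * gap_extend.
  gap_cost <- function(is_gap) {
    runs <- rle(is_gap)
    k <- runs$lengths[runs$values]
    sum(gap_open + gap_extend * k)
  }

  # Substitution score over the columns that contain no gap.
  aligned <- p != "-" & s != "-"
  substitution <- sum(ifelse(p[aligned] == s[aligned],
                             match_score, mismatch_score))

  substitution - gap_cost(p == "-") - gap_cost(s == "-")
}

# 'G' placed before the gap (as reported above) vs. after the gap.
before <- score_alignment("AG-TA", "ACCTA")
after  <- score_alignment("A-GTA", "ACCTA")
print(c(before = before, after = after))
print(identical(before, after))
```

Both placements contain three matches, one mismatch and one gap of 
length one, so any scoring scheme of this form gives them the same 
score and the choice between them is purely a reporting convention.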

I'm not sure it's realistic to expect a perfect match between the output 
of EBI and pairwiseAlignment(). The EBI interface for the global 
algorithm has additional parameters that pairwiseAlignment() does not 
have, and I'm not sure how these are used. The global-local and 
local-global options in pairwiseAlignment() are not the same as plain 
global or plain local alignment and are not, as far as I can see, 
offered through the EBI interface.
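
As a rough illustration of why "global" and "global-local" can give 
different answers, here is a small base-R dynamic-programming sketch. 
It is not the Biostrings implementation: the function align_score(), 
the linear gap penalty and the match/mismatch values are all 
assumptions chosen to keep the example short. In the "global-local" 
case the whole pattern must be aligned, but subject residues before 
and after the aligned stretch are not penalized.

```r
# Hypothetical helper: best alignment score under a linear gap penalty.
# type = "global"       : both sequences aligned end to end
# type = "global-local" : whole pattern against a substring of subject
align_score <- function(pattern, subject,
                        type = c("global", "global-local"),
                        match_score = 1, mismatch_score = -1, gap = 2) {
  type <- match.arg(type)
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  m <- length(p)
  n <- length(s)

  dp <- matrix(0, nrow = m + 1, ncol = n + 1)
  # Skipping pattern residues is always penalized.
  dp[, 1] <- -gap * (0:m)
  # Skipping leading subject residues is penalized only for "global".
  if (type == "global") dp[1, ] <- -gap * (0:n)

  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      sub <- if (p[i] == s[j]) match_score else mismatch_score
      dp[i + 1, j + 1] <- max(dp[i, j] + sub,       # align p[i] with s[j]
                              dp[i, j + 1] - gap,   # gap in subject
                              dp[i + 1, j] - gap)   # gap in pattern
    }
  }

  # "global" ends in the last cell; "global-local" may stop anywhere
  # in the subject once the whole pattern has been consumed.
  if (type == "global") dp[m + 1, n + 1] else max(dp[m + 1, ])
}

print(align_score("CTA", "ACCTAGG", type = "global"))
print(align_score("CTA", "ACCTAGG", type = "global-local"))
```

With these assumed values the global score pays for the four unaligned 
subject residues, while the global-local score does not, so the two 
types are not interchangeable when comparing against another tool.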

Valerie




On 12/29/2011 08:46 PM, Valerie Obenchain wrote:
> On 12/28/11 20:36, cao zongfu wrote:
>> Can anyone help me? Thanks.
>>
>> Zongfu
>
> Hi Zongfu,
>
> Martin is out of the office for the holidays. I'm looking at this and 
> hope to have an answer for you soon.
>
> Valerie
>>
>>
>> ---------- Forwarded message ----------
>> From: cao zongfu<caozongfu at gmail.com>
>> Date: 2011/12/24
>> Subject: Re: [BioC] the diffrent alignment result between
>> pairwiseAlignment() in Biostrings package and EBI-webserver
>> To: Martin Morgan<mtmorgan at fhcrc.org>
>>
>>
>> Hi, Martin,
>>       For example,
>>
>> require("Biostrings")
>> data(BLOSUM62)
>> x1 = "QVQLVQSGAEVKKPGSSVKVSCKTSGDTFSTYAISWVRQAPGQGLEWMGGIIPIFGKAHYAQKFQGRVTITADESTSTAYMELSSLRSEDTAVYFCARKFHFVSGSPFGMDVWGQGTTVTVSS"
>> x2 = "QVQLVESGGDVVQPGGSLRLSCAASGVAFSNYGMHWVRQAPGKGLEWVAVIWYDGSNKYYADSVKGRFTISRDNSKNMLYLQMNSLRAEDTAMYYCARNDDYWGQGTLVTVSS"
>>
>> alm <- pairwiseAlignment(x1, x2, substitutionMatrix=BLOSUM62,
>>                          gapOpening=-10, gapExtension=-0.5,
>>                          type="global-local")
>> alm
>>
>> pattern: [1] QVQLVQSGAEVKKPGSSVKVSCKTSGDTFSTYAISWVRQAPGQGLEWMGGIIPIFGKAHYAQKFQGRVTITADESTSTAYMELSSLRSEDTAVYFCARKFHFVSGSPFGMDVWGQGTTVTVSS
>> subject: [1] QVQLVESGGDVVQPGGSLRLSCAASGVAFSNYGMHWVRQAPGKGLEWVAVIWYDGSNKYYADSVKGRFTISRDNSKNMLYLQMNSLRAEDTAMYYCARND----------DYWGQGTLVTVSS
>>
>> And the result from the EBI webserver is as follows:
>> x1      1 QVQLVQSGAEVKKPGSSVKVSCKTSGDTFSTYAISWVRQAPGQGLEWMGG     50
>> x2      1 QVQLVESGGDVVQPGGSLRLSCAASGVAFSNYGMHWVRQAPGKGLEWVAV     50
>>
>> x1     51 IIPIFGKAHYAQKFQGRVTITADESTSTAYMELSSLRSEDTAVYFCARKF    100
>> x2     51 IWYDGSNKYYADSVKGRFTISRDNSKNMLYLQMNSLRAEDTAMYYCAR--     98
>>
>> x1    101 HFVSGSPFGMDVWGQGTTVTVSS    123
>> x2     99 --------NDDYWGQGTLVTVSS    113
>>
>>
>> We can see that the position of "ND" differs between the two alignment
>> results: in one it is before the gap, and in the other it is after the gap.
>> Thanks,
>>
>> Zongfu
>>
>>
>>
>>
>>
>>
>> 2011/12/24 Martin Morgan<mtmorgan at fhcrc.org>
>>> On 12/22/2011 11:54 PM, cao zongfu wrote:
>>>> Dear Prof., Hi
>>>>         I have compared the results of pairwise alignment from
>>>> pairwiseAlignment() in the Biostrings package and the webserver at
>>>> http://www.ebi.ac.uk/Tools/services/web/toolform.ebi?tool=emboss_needle&context=protein
>>>> and I found that the results are different.
>>>>
>>>> EBI-web parameters are as follows,
>>>> #matrix=BLOSUM62
>>>> #GAP OPEN=10
>>>> #GAP EXTEND=0.5
>>>> #OUTPUT FORMAT="PAIR"
>>>> #END GAP PENALTY=FALSE
>>>> #END GAP OPEN=10
>>>> #END GAP EXTEND=0.5
>>>> and the parameters in pairwiseAlignment() are as follows,
>>>> alm <- pairwiseAlignment(x1, x2, substitutionMatrix=BLOSUM62,
>>>>                          gapOpening=-10, gapExtension=-0.5,
>>>>                          type="global-local")
>>>>
>>>> I have also tried setting type="global". Both calls ran without
>>>> error, but the alignment result is still different. I want to know
>>>> how to set the other parameters in order to get an alignment result
>>>> identical to the EBI webserver's.
>>>
>>> Can you provide a more explicit example, with sequences x1 and x2 as
>>> well as the output from the two programs? Martin
>>>> Thanks,
>>>>
>>>
>>> -- 
>>> Computational Biology
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>>
>>> Location: M1-B861
>>> Telephone: 206 667-2793
>>
>>
>>
>> -- 
>>
>> Zongfu Cao
>>
>> BeiGene(Beijing) Co.,Ltd
>> No.30 Science Park Road
>> Zhong-Guan-Cun Life Science Park
>> Changping District, Beijing P.R.China
>> Postal Code: 102206
>>
>>
>>
>>
>