[BioC] a question about the low level match function

wang peter wng.peter at gmail.com
Tue Nov 6 21:02:00 CET 2012


Dear all, Harry and Steve:
     I am sorry to disturb you again. This time I read the manual
and some of the source code carefully, but I am still confused about
how trimLRPatterns works.
     I traced it back to these functions:

Biostrings:::.computeTrimEnd
showMethods(which.isMatchingEndingAt, includeDefs=TRUE)
Biostrings:::.matchPatternAt

    if (is(subject, "XString"))
        .Call2("XString_match_pattern_at", pattern, subject,
            at, at.type, max.mismatch, min.mismatch, with.indels,
            fixed, ans.type, auto.reduce.pattern, PACKAGE = "Biostrings")
    else .Call2("XStringSet_vmatch_pattern_at", pattern, subject,
        at, at.type, max.mismatch, min.mismatch, with.indels,
        fixed, ans.type, auto.reduce.pattern, PACKAGE = "Biostrings")

I think this calls the low-level C code.

For example:
trimLRPatterns(Rpattern = Rpattern, subject = subject,
max.Rmismatch=0.1, with.Lindels=TRUE)

subject = "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
Rpattern =              "GAATAGTACTGTAGGCACCATCAATAGATCGGAA"

The function then converts max.Rmismatch to
max.Rmismatch= as.integer(max.Rmismatch*1:nchar(Rpattern))
 [1] 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3

As far as I know, the process is to compute the distance between p and s:

p = "GAATAGTACTGTAGGCACCATCAATAGATCGGAA" allowing 3 mismatches
s = "GAATAGTACTGTAGGCACCATCAATAGATCGGAA"

p = "AATAGTACTGTAGGCACCATCAATAGATCGGAA"  allowing 3 mismatches
s = "GAATAGTACTGTAGGCACCATCAATAGATCGGA"
...
p = "A"  allowing 0 mismatches
s = "G"
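
To make the question concrete, here is a minimal base-R sketch of that comparison loop. It is not the Biostrings implementation. It assumes that, for an overlap of length i, the first i letters of Rpattern are compared with the last i letters of subject (one reading of the trimLRPatterns man page), that only substitutions count (no indels), and that the longest acceptable overlap is the one trimmed. count_mismatches, max_mm, trim_len and trimmed are hypothetical names used only for this illustration.

```r
# Base-R sketch only; this is not the Biostrings implementation.
subject  <- "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
Rpattern <- "GAATAGTACTGTAGGCACCATCAATAGATCGGAA"

nRp <- nchar(Rpattern)
nS  <- nchar(subject)

# Allowed number of mismatches for each overlap length i (error rate 0.1)
max_mm <- as.integer(0.1 * seq_len(nRp))

# Hypothetical helper: Hamming distance between two equal-length strings
count_mismatches <- function(a, b) {
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Assumption: try the longest overlap first, comparing the first i letters
# of Rpattern with the last i letters of subject, and stop at the first
# overlap whose mismatch count is within the allowance for that length.
trim_len <- 0L
for (i in seq(min(nRp, nS), 1L)) {
  p <- substr(Rpattern, 1L, i)
  s <- substr(subject, nS - i + 1L, nS)
  if (count_mismatches(p, s) <= max_mm[i]) {
    trim_len <- i
    break
  }
}

# What is left of the subject after removing the matched overlap
trimmed <- substr(subject, 1L, nS - trim_len)
print(trim_len)
print(trimmed)
```

In this example the full 34-letter Rpattern matches the end of the subject exactly, so the loop stops at the first iteration and only the first 14 letters of the subject remain.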

But what does the parameter "at" mean?

-- 
shan gao
Room 231(Dr.Fei lab)
Boyce Thompson Institute
Cornell University
Tower Road, Ithaca, NY 14853-1801
Office phone: 1-607-254-1267(day)
Official email:sg839 at cornell.edu
Facebook:http://www.facebook.com/profile.php?id=100001986532253
