[BioC] a problem of trimLRPatterns still confused me

Wang Peter wng.peter at gmail.com
Fri Nov 30 05:17:24 CET 2012


Sorry to disturb you again, but I am still confused.

Please see this problem:

library(Biostrings)

subject <- "GGTAACTTTTCTGACACCTCCTGCTTAAAACCCCAAAGGTCAGAAGGATCGTGAGGCCCCGCTTTCACGGTCTGTATTCGTACTGAAAATCAAGATCAAG"

Rpattern <- "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG"

# one threshold per overlap length 1..nchar(Rpattern): 20% of that length
max.mismatches <- 0.2*1:nchar(DNAString(Rpattern))

trimLRPatterns(Rpattern = Rpattern, subject = subject,
               max.Rmismatch = max.mismatches, with.Rindels = TRUE)
[1] "GGTAACTTTTCTGACACCTCCTGCTTAAAACCCCAAAGGTCAGAAGGATCGTGAGGCCCCGCTTTCACGGTCTGTATTCGTACTGAAAAT"
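For reference, the trimmed suffix can be recovered by comparing the input with the output above. This is a small base R sketch; `result` and `removed` are names introduced here for illustration, and both strings are copied from this message.

```r
# Input read and the string returned by trimLRPatterns(), copied from above.
subject <- "GGTAACTTTTCTGACACCTCCTGCTTAAAACCCCAAAGGTCAGAAGGATCGTGAGGCCCCGCTTTCACGGTCTGTATTCGTACTGAAAATCAAGATCAAG"
result  <- "GGTAACTTTTCTGACACCTCCTGCTTAAAACCCCAAAGGTCAGAAGGATCGTGAGGCCCCGCTTTCACGGTCTGTATTCGTACTGAAAAT"

# The result should be a prefix of the subject; whatever follows it was trimmed.
stopifnot(startsWith(subject, result))
removed <- substring(subject, nchar(result) + 1)

removed         # "CAAGATCAAG"
nchar(removed)  # 10
```

So 10 letters were removed from the right end of the 100-letter read.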

CAAGATC  AAG
  AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG

The overlap length is 10 bp, so the allowed edit distance is 10 * 0.2 = 2.

So I expected it to trim only "AGATCAAG" and to leave the leading "CA" in place.
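The expectation above can be checked by hand with base R's `adist()`, which computes a generalized Levenshtein (edit) distance. This is only a sketch of the reasoning, assuming that the 10-letter adapter prefix is compared with the 8-letter read suffix; it does not show how trimLRPatterns scores the overlap internally. The names `overlap`, `prefix`, `allowed` and `d` are introduced here for illustration.

```r
Rpattern <- "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG"

overlap <- 10                               # overlap length assumed in the text
prefix  <- substring(Rpattern, 1, overlap)  # "AGATCGGAAG"
allowed <- 0.2 * overlap                    # 2

# Edit distance between the suffix I expected to be trimmed and the adapter prefix.
d <- adist("AGATCAAG", prefix)[1, 1]
d             # 2 (the two G's missing from the read)
d <= allowed  # TRUE
```

Under these assumptions the 8-letter suffix alone is already within the allowed distance.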

Why is the "CA" trimmed as well?



-- 
shan gao
Room 231(Dr.Fei lab)
Boyce Thompson Institute
Cornell University
Tower Road, Ithaca, NY 14853-1801
Office phone: 1-607-254-1267(day)
Official email:sg839 at cornell.edu
Facebook:http://www.facebook.com/profile.php?id=100001986532253


