[BioC] questions about matchPattern and vmatchPattern

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Nov 1 18:44:28 CET 2012


Hi,

You'd find it very informative if you did a bit more exploratory
analysis (and documentation reading!) ... I think you will find that
you can answer most of these questions yourself.

For example, see inline:

On Thu, Nov 1, 2012 at 1:21 PM, wang peter <wng.peter at gmail.com> wrote:
> dear ALL:
>        Please this sample
> subject = "TGCATTT"
> Rpattern = "TGCAATTT"
> result <- matchPattern(Rpattern, subject, max.mismatch= 4, min.mismatch=0)
> result
>   Views on a 7-letter BString subject
> subject: TGCATTT
> views:
>     start end width
> [1]     0   7     8 [ TGCATTT]
> [2]     1   8     8 [TGCATTT ]
>
>
>
> is the start position and end position on the subject or pattern?

R> matchPattern("GATACA", "GTTGACGATAGATACATTCAAGATACAAA")
  Views on a 29-letter BString subject
subject: GTTGACGATAGATACATTCAAGATACAAA
views:
    start end width
[1]    11  16     6 [GATACA]
[2]    22  27     6 [GATACA]

Given that the pattern is only 6 NT long, do you think the result
returned is on the subject or the pattern?
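
One way to check this for yourself is to pull the coordinates out of the
result and use them to slice the subject. This is only a minimal sketch,
assuming the Biostrings package is installed; the variable names (subject,
hits, first_hit) are just for illustration:

```r
library(Biostrings)

# Same subject and pattern as in the example above
subject <- DNAString("GTTGACGATAGATACATTCAAGATACAAA")
hits <- matchPattern("GATACA", subject)

# Coordinates of each hit, as shown in the start/end/width columns
start(hits)   # 11 22
end(hits)     # 16 27
width(hits)   # 6 6

# Slicing the subject with the first hit's coordinates gives back the
# matched text, so start/end are positions on the subject
first_hit <- as.character(subseq(subject, start(hits)[1], end(hits)[1]))
first_hit     # "GATACA"
```

In your own output the views start at 0 and end at 8 on a 7-letter subject,
which fits the same reading: the coordinates are on the subject, and once
mismatches are allowed a view can hang over either end of it.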

> and for vmatchPattern result
> if one pattern has many hits on one sequence, does it return only
> one hit or all of hits as results?

Technically neither.

If you look at the Value section of ?matchPattern, you will see that
vmatchPattern returns an MIndex object.

"But what's an MIndex object," you ask?

R> ?MIndex
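
If you want to poke at one directly, something like the following should
work. It is a rough sketch, assuming a reasonably recent Biostrings; the
sequences and their names (seq1, seq2, seq3) are made up for illustration,
and the accessor for hit counts has gone by other names in older releases:

```r
library(Biostrings)

# Three made-up subject sequences: two hits, one hit, and no hit
subjects <- DNAStringSet(c(seq1 = "GTTGACGATAGATACATTCAAGATACAAA",
                           seq2 = "TTTTGATACA",
                           seq3 = "CCCC"))

mi <- vmatchPattern("GATACA", subjects)

# An MIndex object has one element per subject sequence
length(mi)                        # 3

# Number of hits in each subject sequence
hit_counts <- unname(elementNROWS(mi))
hit_counts                        # 2 1 0

# All hits for the first sequence, as an IRanges object
mi[[1]]
start(mi[[1]])                    # 11 22

# Start positions of the hits, as a list with one element per sequence
startIndex(mi)
```

So every hit is kept, grouped by the subject sequence it was found in.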

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


