[BioC] motif searching with variable length gaps

Houseman, Heather Heather.Houseman at vai.org
Thu Nov 1 19:10:16 CET 2007


Herve,

My ultimate goal is to find motifs in different sequences that are similar to the ones below.

TACGTGCTGTCTCACACAG
GACGTGACTCGGACCACAT
TACGTGGGT--TTCCACAG
TACGTGAC----CACACAC
TACGTGC-------CACAG
CACGTGC-------CACAC
GGCGTGAGC-----CACCG
GGCGTGGGAGCG--CACAG
TACGTG------CACACAG

To start, I'm inserting the motifs above into random sequences to see whether cosmo can recover them.  Once that procedure works, I'd like to apply it to "real" sequences and hopefully return motifs that look similar to the ones above.
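
For reference, the planting step described above could look roughly like the sketch below. It uses base R only; the helper names `random_dna` and `plant_motif`, the background length of 100 bases, and the uniform base frequencies are assumptions made for illustration, not part of cosmo. The alignment gaps ("-") are stripped before planting, so the planted motifs have different widths.

```r
# Sketch: plant the example motifs into random DNA backgrounds (base R only).
set.seed(1)  # make the random backgrounds reproducible

aligned <- c("TACGTGCTGTCTCACACAG", "GACGTGACTCGGACCACAT", "TACGTGGGT--TTCCACAG",
             "TACGTGAC----CACACAC", "TACGTGC-------CACAG", "CACGTGC-------CACAC",
             "GGCGTGAGC-----CACCG", "GGCGTGGGAGCG--CACAG", "TACGTG------CACACAG")

# Remove the alignment gap characters to get the ungapped motif instances.
motifs <- gsub("-", "", aligned, fixed = TRUE)

# Hypothetical helper: random background of n bases, uniform base frequencies.
random_dna <- function(n) {
  paste(sample(c("A", "C", "G", "T"), n, replace = TRUE), collapse = "")
}

# Hypothetical helper: insert `motif` at a random offset into a fresh background.
plant_motif <- function(motif, len = 100) {
  bg  <- random_dna(len)
  pos <- sample.int(len + 1, 1) - 1  # background bases before the motif (0..len)
  paste0(substr(bg, 1, pos), motif, substr(bg, pos + 1, len))
}

# One planted motif per sequence.
seqs <- vapply(motifs, plant_motif, character(1), USE.NAMES = FALSE)
```

Each element of `seqs` is one background sequence carrying exactly one planted motif; how the sequences are then handed to cosmo is not covered by this sketch.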

Here's the cosmo code I'm using:

res = cosmo(seqs = seqs, minW = 12, maxW = 20, models = "OOPS")

Is this more along the lines of multiple sequence alignment and not something that I can use cosmo for?

Thanks!

Heather

-----Original Message-----
From: Herve Pages [mailto:hpages at fhcrc.org]
Sent: Thursday, November 01, 2007 1:33 PM
To: Houseman, Heather
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] motif searching with variable length gaps

Hi Heather,

Can you please give some examples of your motifs?

Showing us the code that you use with cosmo would also be useful.

Even if the matchPattern() function in Biostrings doesn't let you control the number
of gaps, there might be workarounds; it all depends on what your motifs really look
like. And we need use cases anyway so we know where to put our efforts. Thanks!
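
One possible workaround of this kind, sketched under assumptions: if a motif can be described as two fixed flanks separated by a spacer of bounded length, a base R regular expression can search for it. This is not the matchPattern() interface, only a stand-in, and the flanks ("CGTG", "CAC") and the spacer bounds (0 to 8 bases) are hypothetical choices that would have to come from the real motifs. The subject sequences are made up for the example.

```r
# Sketch: search for "left flank, variable-length spacer, right flank" with a
# base R regular expression. Flanks and spacer bounds are assumptions.
gapped <- "CGTG[ACGT]{0,8}CAC"

# Made-up subject sequences; the third contains no occurrence.
subject <- c("AAATACGTGCCACAGTTT",
             "TTGGCGTGGGAGCGCACAGAA",
             "ACGTACGTACGT")

# gregexpr() returns, per sequence, the start positions and lengths of the
# non-overlapping matches (-1 when there is none).
hits <- gregexpr(gapped, subject)

# regmatches() extracts the matched substrings, one list element per sequence.
found <- regmatches(subject, hits)

# Start position and width of the first hit in each sequence (NA when absent).
starts <- vapply(hits, function(h) if (h[1] > 0) h[1] else NA_integer_, integer(1))
widths <- vapply(hits, function(h) {
  if (h[1] > 0) attr(h, "match.length")[1] else NA_integer_
}, integer(1))
```

Because regular-expression matches do not overlap, nested or overlapping occurrences of the motif would be reported only once; whether that matters depends on the sequences.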

H.


Houseman, Heather wrote:
> Dear Bioconductor mailing list:
>
> I've been using cosmo to look for motifs.  I'd like to search for motifs that have a variable length of gaps in the middle. If I specify a range of motif widths with the cosmo function, it uses the width with the lowest BIC value and searches for motifs of only that width.  My dilemma is that the motifs I'm looking for are of variable width.
>
> Thanks in advance for any help!
>
> Heather


