[BioC] how to extract promoter regions and detect motif occurrence counts?

Steve Lianoglou mailinglist.honeypot at gmail.com
Sun Oct 23 19:22:44 CEST 2011


Hi,

On Sun, Oct 23, 2011 at 12:44 PM, Edward Turner <edtuer at gmail.com> wrote:
> Hi,
>
>  I'm new to Bioconductor. Could anyone give some hints about which
> package(s) I should use for the following purposes:
>
> 1. Extract the promoter regions of 100 given genes, identified by Entrez ID
> 2. Count the occurrences of a given motif in the promoter region of each
> gene.

Get familiar with:

(1) GenomicFeatures
(2) GenomicRanges
(3) IRanges
(4) Biostrings
(5) The BSgenome.*.* package for the organism you are working with.

(1) Use GenomicFeatures to get the location of each gene's transcripts
(it gives you, among other things, transcription bounds). You define
the promoter yourself as XX bp upstream of the transcription start
site; keep in mind that "upstream" depends on the strand the gene is
on.
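
To make the "XX bp upstream" idea concrete, here is a rough,
language-neutral sketch of the coordinate arithmetic, written in plain
Python rather than with the Bioconductor packages. The Interval type,
the promoter() helper and the example coordinates are all made up for
illustration, and I am assuming 1-based inclusive coordinates. It does
not clip at the chromosome end, so treat it as a sketch only.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical genomic interval: 1-based, inclusive coordinates."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def promoter(tx: Interval, upstream: int) -> Interval:
    """Return the `upstream` bp immediately 5' of the transcript's TSS."""
    if tx.strand == "+":
        # On the plus strand the TSS is tx.start, so the promoter sits
        # at smaller coordinates. Clip at position 1.
        start = max(1, tx.start - upstream)
        end = tx.start - 1
    elif tx.strand == "-":
        # On the minus strand the TSS is tx.end, so the promoter sits
        # at larger coordinates (no clipping at the chromosome end here).
        start = tx.end + 1
        end = tx.end + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return Interval(tx.chrom, start, end, tx.strand)


# Made-up transcripts, one per strand.
plus_tx = Interval("chr1", 1000, 2000, "+")
minus_tx = Interval("chr1", 1000, 2000, "-")
plus_promoter = promoter(plus_tx, 500)
minus_promoter = promoter(minus_tx, 500)
```

The only point is that the same transcript bounds give a different
promoter depending on strand.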

(2) The results from (1) will be returned to you in a data structure
defined in GenomicRanges, which in turn relies heavily on the IRanges
infrastructure.

(3) The Biostrings + BSgenome.*.* packages will allow you to fetch the
sequences for the promoter ranges you defined in (1) and count the
occurrences of your motif in each of them.
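
As a rough idea of what "counting occurrences" means once you have the
promoter sequences in hand, here is a minimal sketch in plain Python
(not the Biostrings API). It assumes exact matching, counts overlapping
hits, and also looks for the motif on the opposite strand. The helper
names, gene labels and sequences are invented for the example.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string (A/C/G/T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def count_motif(seq: str, motif: str) -> int:
    """Count exact, possibly overlapping, occurrences of motif in seq."""
    seq, motif = seq.upper(), motif.upper()
    if not motif:
        raise ValueError("motif must not be empty")
    count = 0
    pos = seq.find(motif)
    while pos != -1:
        count += 1
        # Advance by one base so overlapping hits are counted too.
        pos = seq.find(motif, pos + 1)
    return count


def count_motif_both_strands(seq: str, motif: str) -> int:
    """Count hits for the motif and for its reverse complement."""
    rc = reverse_complement(motif)
    forward = count_motif(seq, motif)
    if rc.upper() == motif.upper():
        # Palindromic motif: both strands give the same hits, so do
        # not count them twice.
        return forward
    return forward + count_motif(seq, rc)


# Made-up promoter sequences keyed by hypothetical gene labels.
promoter_seqs = {
    "geneA": "TATAAGCTTATA",
    "geneB": "GATTACATGTAATC",
    "geneC": "CCCCCCCC",
}
tata_counts = {g: count_motif_both_strands(s, "TATA")
               for g, s in promoter_seqs.items()}
```

Whether you want both strands, overlapping hits, or mismatches allowed
is your call; the packages above let you choose.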

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


