[BioC] retrieving upstream/intronic sequences using biomaRt

Krys Kelly kak28 at cam.ac.uk
Tue Sep 19 12:35:07 CEST 2006


Hi Henrik,

A package?  The more one looks, the more one finds!  The attached
spreadsheet is very much a work in progress and a bit messy and incomplete.
We started it after a very quick review of the literature, so it is also far
from comprehensive.  However, it will probably give you more than enough
information to get started. 

This review should be helpful:

Tompa et al. (2005) Assessing computational tools for the discovery of
transcription factor binding sites. Nature Biotechnology 23(1): 137-144.

The three 'old-timer' programs that everyone seems to use are AlignACE, MEME
and Consensus. We have also been using Weeder, SOMBRERO and NestedMICA.
Be aware that some of the programs (e.g. AlignACE) can give quite different
answers on different runs, even with the same parameters, and that different
programs can give very different answers from each other. A number of people
(ourselves included) run several of the programs and take the motifs that
turn up in most of them forward for further study.
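
As a rough illustration of that "keep what most programs agree on" step, here
is a minimal base R sketch. The motif strings are made-up placeholders, not
real output from any of the tools, and matching motifs by exact string is a
simplification: in practice motifs are usually matrices and need some kind of
similarity comparison.

```r
# Hypothetical motif calls: one vector of consensus strings per tool.
# These strings are invented for illustration only.
calls <- list(
  AlignACE  = c("TGACTCA", "CACGTG", "GGGCGG"),
  MEME      = c("TGACTCA", "CACGTG"),
  Weeder    = c("TGACTCA", "TATAAA"),
  Consensus = c("CACGTG", "TGACTCA")
)

# Count how many tools report each motif (each tool counted at most once).
support <- table(unlist(lapply(calls, unique)))

# Keep motifs reported by more than half of the tools.
min_tools <- floor(length(calls) / 2) + 1
shared <- sort(names(support)[support >= min_tools])
shared
```

The threshold (more than half of the tools) is an arbitrary choice for the
sketch; a stricter or looser cut-off may suit a given data set better.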

There are also programs that search sequences for known motifs (e.g. MAST,
the companion to MEME, MSCAN and SiteSeer). Two well-known databases of
transcription factor binding sites are TRANSFAC and JASPAR.
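
To show roughly what searching for a known motif involves, here is a small
base R sketch that scores every window of a sequence against a position
weight matrix. It is not how MAST or any of the other programs work
internally; the count matrix, the pseudocount of 1, the uniform background of
0.25 and the function name score_windows are all assumptions made for the
example.

```r
# Toy position frequency matrix (rows A, C, G, T; one column per position).
# The counts are invented, not taken from TRANSFAC or JASPAR.
pfm <- rbind(
  A = c(0, 8, 0, 0),
  C = c(8, 0, 8, 0),
  G = c(0, 0, 0, 8),
  T = c(0, 0, 0, 0)
)

# Add a pseudocount, convert to probabilities, then to log2 odds
# against a uniform background of 0.25 per base.
counts <- pfm + 1
probs  <- sweep(counts, 2, colSums(counts), "/")
pwm    <- log2(probs / 0.25)

# Score every window of the sequence; assumes only A, C, G and T occur.
score_windows <- function(dna, pwm) {
  bases  <- strsplit(toupper(dna), "")[[1]]
  w      <- ncol(pwm)
  starts <- seq_len(max(0, length(bases) - w + 1))
  vapply(starts, function(i) {
    rows <- match(bases[i:(i + w - 1)], rownames(pwm))
    sum(pwm[cbind(rows, seq_len(w))])
  }, numeric(1))
}

scores <- score_windows("TTCACGAA", pwm)
which.max(scores)  # start position of the best-scoring window
```

Only the forward strand is scanned here; a real search would normally also
score the reverse complement.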

Hope this helps.

Krys


Dr Krystyna A Kelly
University of Cambridge
Department of Pathology
Molteno Building, Tennis Court Road
Cambridge CB2 1QP
Tel:    01223 333331
Email: kak28 at cam.ac.uk
 

-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Henrik Hornshøj
Jensen
Sent: 19 September 2006 10:33
To: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] retrieving upstream/intronic sequences using biomaRt

Any of you guys know a package that will predict regulatory sites in
upstream regions?

Regards,
Henrik
 


-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Steffen Durinck
Sent: Wednesday, September 13, 2006 2:25 PM
To: Shamit Soneji
Cc: BioC
Subject: Re: [BioC] retrieving upstream/intronic sequences using biomaRt

Hi Shamit,

Yes, with biomaRt you can get the upstream sequences but currently not the
intronic sequences.
 Try:

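library(biomaRt)
# Connect to the Ensembl mart and select the human gene dataset
ensmart = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# Fetch the 5' UTR sequence for one Ensembl gene ID
getSequence(id = "ENSG00000139618", type = "ensembl", mart = ensmart, seqType = "5utr")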
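
Note that "5utr" returns the 5' UTR rather than the flanking sequence
upstream of the gene. A hedged sketch for the upstream flank, assuming a more
recent biomaRt release in which getSequence() accepts type =
"ensembl_gene_id", seqType = "gene_flank" and an upstream argument, could
look like this. The wrapper name get_upstream and the default of 1000 bases
are my own choices for illustration.

```r
# Sketch only: needs the biomaRt package and a network connection to Ensembl.
get_upstream <- function(gene_id, n_bases = 1000,
                         dataset = "hsapiens_gene_ensembl") {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("the biomaRt package is required")
  }
  mart <- biomaRt::useMart("ensembl", dataset = dataset)
  # "gene_flank" asks for flanking sequence; `upstream` sets its length.
  biomaRt::getSequence(id = gene_id, type = "ensembl_gene_id",
                       seqType = "gene_flank", upstream = n_bases,
                       mart = mart)
}

# Example call (not run here because it queries the Ensembl server):
# get_upstream("ENSG00000139618", n_bases = 1000)
```

In the same releases, seqType = "gene_exon_intron" should return the
unspliced gene sequence including introns, but check the getSequence() help
page for the version you have installed.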

Cheers,
Steffen


Shamit Soneji wrote:
> Is it possible using biomaRt (or any other R/BioC means) to download 
> the upstream and intron sequences for any given ensembl ID?
>
> I know this can be done just using straight biomart, but a facility 
> like this from R would be very useful if one wants to search for TF 
> binding sites.
>
> Many thanks
>
> Shamit
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: 
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>   


--
Steffen Durinck, Ph.D.

Oncogenomics Section
Pediatric Oncology Branch
National Cancer Institute, National Institutes of Health
URL: http://home.ccr.cancer.gov/oncology/oncogenomics/

Phone: 301-402-8103
Address:
Advanced Technology Center,
8717 Grovemont Circle
Gaithersburg, MD 20877

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
