[BioC] Regarding extraction of 3' and 5'UTRs and exonic region of a gene.
hpages at fhcrc.org
Sat Jun 29 00:23:32 CEST 2013
Good that you mention KEGG. I should probably have mentioned the
KEGG.db package for step 1 of the proposed workflow, even though I
have no direct experience with it. Unfortunately, my understanding is
that it's about to be deprecated (because of licensing issues). I've
heard there are some alternatives though. Hopefully more knowledgeable
people will chime in with helpful suggestions.
On 06/27/2013 09:57 PM, Abdul Rawoof wrote:
> Thanks for your kind suggestion. I will try to follow your suggested
> workflow, though it will obviously take time to learn all these packages
> as I have never used them.
> One more thing I want to ask: how can I download the list of all
> available human cancer genes for Wnt signaling from the KEGG database?
> Please forgive me if this is a senseless question; I have not tried
> the packages you mentioned yet.
> Abdul Rawoof
> On Thu, Jun 27, 2013 at 11:01 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
> Hi Abdul,
> Suggested workflow:
> 1. Build the list of genes involved in the particular cancer you're
> interested in. Could be a vector of gene ids or transcript ids (not
> all transcripts are necessarily linked to a gene).
> Suggested tools (not exhaustive): GO.db and org.Hs.eg.db packages,
> maybe the DO.db package, etc... I'm not sure what would be the best
> tool for this. But maybe you already have your list of genes?
> 2. Use the TxDb.Hsapiens.UCSC.hg19.knownGene and GenomicFeatures packages
> to extract the coordinates of the 5'UTRs and 3'UTRs.
> Use the fiveUTRsByTranscript() and threeUTRsByTranscript() functions
> for this. They'll return the result in a GRangesList object (you'll
> have to become a bit familiar with those objects first).
> 3. Use the BSgenome.Hsapiens.UCSC.hg19 package and the
> extractTranscriptsFromGenome() function from the GenomicFeatures
> package to extract the UTR sequences.
> The name of the function is misleading but it can be used to extract
> CDS or UTR sequences in addition to transcript sequences.
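For what it's worth, steps 2 and 3 above can be sketched roughly as follows. This is only a minimal sketch under a few assumptions: the gene_ids vector (Entrez ids for BRCA1 and BRCA2) is a made-up placeholder for whatever list step 1 produces, and extractTranscriptSeqs() is used in place of extractTranscriptsFromGenome(), which recent GenomicFeatures releases no longer provide. All three packages need to be installed from Bioconductor first.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(BSgenome.Hsapiens.UCSC.hg19)

txdb   <- TxDb.Hsapiens.UCSC.hg19.knownGene
genome <- BSgenome.Hsapiens.UCSC.hg19

## Hypothetical output of step 1: Entrez gene ids (672 = BRCA1, 675 = BRCA2).
gene_ids <- c("672", "675")

## Map the gene ids to UCSC transcript names.
tx_by_gene <- transcriptsBy(txdb, by = "gene")
tx_by_gene <- tx_by_gene[names(tx_by_gene) %in% gene_ids]
tx_names   <- mcols(unlist(tx_by_gene))$tx_name

## Step 2: UTR coordinates as GRangesList objects named by transcript,
## restricted to the transcripts of interest.
five_utrs  <- fiveUTRsByTranscript(txdb, use.names = TRUE)
three_utrs <- threeUTRsByTranscript(txdb, use.names = TRUE)
five_utrs  <- five_utrs[names(five_utrs) %in% tx_names]
three_utrs <- three_utrs[names(three_utrs) %in% tx_names]

## Exonic regions, grouped by transcript, for the same transcripts.
exons_by_tx <- exonsBy(txdb, by = "tx", use.names = TRUE)
exons_by_tx <- exons_by_tx[names(exons_by_tx) %in% tx_names]

## Step 3: UTR sequences as DNAStringSet objects (one sequence per transcript).
## Assumption: extractTranscriptSeqs() stands in for extractTranscriptsFromGenome().
five_utr_seqs  <- extractTranscriptSeqs(genome, five_utrs)
three_utr_seqs <- extractTranscriptSeqs(genome, three_utrs)
```

names(five_utr_seqs) then gives the UCSC transcript names. Transcripts with no annotated UTR (non-coding ones, for instance) simply don't appear in the result.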
> If you've never used those tools before, it will take you some time to
> get familiarized with them. Your best friends are the man pages for the
> individual functions/classes you're going to run into (don't miss the
> examples section) and the vignettes in the GenomicRanges and
> GenomicFeatures packages.
> Let us know if you have specific questions or run into specific problems
> (show us what you've done and explain the problem -- don't forget your
> Good luck,
> On 06/27/2013 01:58 AM, Abdul Rawoof wrote:
> Hello everyone,
> Could anyone show me how I can extract the 3' and 5' UTRs and
> exonic regions of all human genes from the Ensembl and KEGG
> databases that are involved in a particular cancer, especially
> breast cancer, using
> Thanks in advance.
> Abdul Rawoof
> Hervé Pagès
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319