[BioC] Coverage by base

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Oct 10 16:40:11 CEST 2011


Hi Rohan,

On Mon, Oct 10, 2011 at 10:28 AM, rohan bareja <rohan_1925 at yahoo.co.in> wrote:
> Hi,
>
>
>
> I want to sum the coverage on a per-gene basis, so that I
> could get the total number of reads in those intervals and the genes to which
> they belong. I have used viewSums, which gives me the total number of reads, but I
> would like to get the information about genes.

You should familiarize yourself with the GenomicFeatures package and
build yourself a TranscriptDb for your organism & gene annotation
combination of interest.

You can then get all of the annotated genes/transcripts/etc. for your
organism into different flavors of GenomicRanges objects (easiest, I
guess, is a GRangesList).

If your reads are stored in a GRanges, IRanges, or similar data
structure, you can use the "countOverlaps" function with your
transcript GRangesList object and your reads object to get what you are
after.
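
To make the idea concrete without spelling out the Bioconductor calls,
here is a toy sketch in plain Python (not R, and not the GenomicRanges
API). It only illustrates the counting logic as I understand it: each
gene is a group of intervals, and a read is counted once for a gene if
it overlaps any interval in that group. The names Interval, overlaps,
and count_reads_per_gene are made up for this illustration, the
coordinates are invented, and strand is ignored.

```python
from collections import namedtuple

# Hypothetical toy type: 1-based, inclusive coordinates on a chromosome.
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def overlaps(a, b):
    """True if two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def count_reads_per_gene(genes, reads):
    """Count reads per gene.

    genes: dict mapping gene name -> list of Interval (e.g. its exons)
    reads: list of Interval
    A read that hits several intervals of the same gene is counted once.
    """
    return {
        name: sum(any(overlaps(read, exon) for exon in exons) for read in reads)
        for name, exons in genes.items()
    }


# Invented example data, for illustration only.
genes = {
    "geneA": [Interval("chr1", 100, 200), Interval("chr1", 300, 400)],
    "geneB": [Interval("chr1", 1000, 1200)],
}
reads = [
    Interval("chr1", 150, 185),    # inside the first geneA interval
    Interval("chr1", 190, 310),    # touches both geneA intervals, counted once
    Interval("chr1", 1100, 1135),  # inside geneB
    Interval("chr2", 150, 185),    # other chromosome, overlaps nothing
]

counts = count_reads_per_gene(genes, reads)
print(counts)  # {'geneA': 2, 'geneB': 1}
```

The result is keyed by gene name, which is the piece of information a
plain per-interval sum does not carry along.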

I'm deliberately being a bit vague (as in, not giving you exact code)
in order to encourage you to get more familiar with these packages
yourself, so you can better answer different flavors of these types of
questions as they continue to pop up in your analysis.

Hope that helps,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


