[BioC] Fwd: summarizeOverlaps behavior on RNA-Seq paired end stranded data

Valerie Obenchain vobencha at fhcrc.org
Wed Oct 2 19:28:13 CEST 2013


Hi,

On 10/02/2013 12:35 AM, Abhishek Pratap wrote:
> Hi All
>
> Just wanted to check on the expected behavior of the summarizeOverlaps
> function in the GenomicRanges package for the following cases.
>
>
> 1. In the case of paired-end data, does it count each read pair as 1 or 2
> against a feature?

Pairs are counted as a single 'hit' regardless of whether one or both 
mates overlap the feature.

>
> 2. For stranded RNA-Seq protocols, does it take into account the opposite
> strand of a read's mate when counting, or does it consider only the one
> read matching the strand?

When counting, the two mates of a pair are treated as though they are 
from the same strand. In general, you can ignore the strand when 
counting by setting 'ignore.strand=TRUE'.

>
> 3. For a second-strand RNA-Seq protocol, where read 1 maps to the
> opposite strand of the gene, will summarizeOverlaps work?

I'm not sure what you mean. Please provide an example.


summarizeOverlaps() reads the data from a BAM file into a GAlignmentPairs 
or GAlignmentsList container for counting. You can investigate the 
behavior using a small test case.

## paired-end record
ga1 <- GAlignments("chr1", 1L, "10M", strand("+"))
ga2 <- GAlignments("chr1", 15L, "11M", strand("-"))
galp <- GAlignmentPairs(ga1, ga2, isProperPair=TRUE)

## annotation
ann <- GRanges("chr1", IRanges(c(1, 5, 12, 20), c(25, 20, 14, 30)), "-")

## all with mode="Union"
se1 <- summarizeOverlaps(ann[1], galp, ignore.strand=TRUE)
 > assays(se1)$counts
      reads
[1,]     1
se2 <- summarizeOverlaps(ann[1], galp, ignore.strand=FALSE)
 > assays(se2)$counts
      reads
[1,]     0
se3 <- summarizeOverlaps(ann[4], galp, ignore.strand=TRUE)
 > assays(se3)$counts
      reads
[1,]     1
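
The same kind of test case can be used to check the first two answers 
directly. The sketch below is only illustrative: it assumes a current 
Bioconductor install, where these classes live in the GenomicAlignments 
package (in 2013 they were part of GenomicRanges), and the '+' strand 
feature is a made-up annotation, not one from the example above. Both 
mates fall inside the feature, so if pairs are counted once the result 
should be a single count, and the strand reported for the pair should be 
that of the first mate.

```r
## Assumption: GenomicAlignments provides GAlignments, GAlignmentPairs
## and summarizeOverlaps() in current Bioconductor releases.
library(GenomicAlignments)

## One pair; both mates lie within positions 1-25 of chr1.
first <- GAlignments(Rle("chr1"), 1L, "10M", strand("+"))
last  <- GAlignments(Rle("chr1"), 15L, "11M", strand("-"))
pair  <- GAlignmentPairs(first, last, isProperPair=TRUE)

## Strand assigned to the pair as a whole.
pair_strand <- as.character(strand(pair))

## Hypothetical feature on the '+' strand covering both mates.
feature <- GRanges("chr1", IRanges(1, 25), "+")

## Default counting mode (Union), strand taken into account.
se <- summarizeOverlaps(feature, pair, ignore.strand=FALSE)
n_hits <- assays(se)$counts[1, 1]

pair_strand
n_hits
```

If n_hits comes back as 2, or pair_strand as "-", then the installed 
version behaves differently from what is described here and the 
documentation for that version should be consulted.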


Valerie


>
> Thanks!
> -Abhi
>


