[BioC] countGenomicOverlaps output

Valerie Obenchain vobencha at fhcrc.org
Sat Jul 23 19:27:06 CEST 2011


Hi Mete,

In the context of countGenomicOverlaps, a GRangesLit is used to 
represent reads with gaps in the CIGAR. The top level of a GRangesList 
represents a single read and the list elements represent the multiple 
segments of the read. Taking your qry object as an example, this list 
would represent a single read from 10 to 110 that is broken into three 
portions by gaps from 20-60 and 70-100. Only one of the three segments 
overlaps with the subject, hence the score of 1/3.

If I understand correctly, you have a GRangesList where each top level 
is a read and the list elements are the multiple ranges where the read 
aligns to the genome. In this example you have a read of length 10 that 
aligned to 3 different locations.

If you want to identify when all list elements map to the subject you 
could do something like

query <- GRangesList(read1=GRanges(seq='1', IRanges(c(10,60,100), 
c(20,70,110))),
                                   read2=GRanges(seq='2', 
IRanges(c(150,170), c(160,180))))
sub <- GRangesList(feature1=GRanges(seq='1', IRanges(10,30)),
                                 feature2=GRanges(seq='2', 
IRanges(145,195)))

olap <- countOverlaps(unlist(query), sub)
elements <- elementLengths(query)
lst <- split(olap, rep(seq_len(length(query)), elements))
counts <- sapply(lst, sum)
uniqueMap <- counts == elements


Valerie



On 07/22/11 11:55, Mete Civelek wrote:
> Hi All,
>
> I want countGenomicOverlaps to output a count of uniquely mapping reads
> within a genomic feature. Will setting the resolution parameter to 'none'
> allow countGenomicOverlaps to ignore reads which map to multiple locations
> in the genome? If so, countGenomicOverlaps doesn't behave the way I expect
> it to. I am using the Bioconductor GenomicRanges package version 1.4.6.
>
> Example:
>
> library(GenomicRanges)
> subj = GRangesList(feature1=GRanges(seq='1', IRanges(10,30), strand='+'))
> qry = GRangesList(read1=GRanges(seq='1', IRanges(c(10,60,100),c(20,70,110)),
> strand='+'))
> countGenomicOverlaps(qry, subj, resolution='none')
>
> I would have expected the hit count to be 0 but instead it reports it as
> 1/3. Am I using this function correctly?
>
> Thanks,
>
> Mete
>
>
> IMPORTANT WARNING:  This email (and any attachments) is ...{{dropped:9}}
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list