[BioC] find overlap of bed files of different length

Duke duke.lists at gmx.com
Mon Jan 31 16:30:08 CET 2011


On 1/30/11 9:34 AM, Martin Morgan wrote:
> On 01/29/2011 04:33 PM, Duke wrote:
>> Hi all,
>>
>> I need to find the overlap between a text file (BED format) and a gene
>> reference. The BED file contains sequences of different lengths, and I
>> need to find all the sequences that lie inside the gene (meaning the
>> overlap percentage is 100%). I found the findOverlaps function in
>> GenomicRanges, but the parameter that controls overlap (minoverlap) does
>> not let me control the percentage.
> The type="within" argument is available for
> findOverlaps,IRanges,IRanges-method; you could use this by extracting
> the ranges(gr) from your query / subject for each seqname / strand
> subset you are interested in.
>
> The development version of GenomicRanges also now supports
> findOverlaps,GenomicRanges,GenomicRanges-method, so using the
> development version of R is also a solution.

Thanks, Martin, for your suggestion. After posting the question, I also
found out that the findOverlaps method for IRanges has type="within".
Unfortunately, "within" is just one of the cases I need to handle.
What I really want is to control the overlap percentage (similar
to minoverlap, but expressed as a percentage). Does the development
version of GenomicRanges support that? Or do you know of any other
packages that support percentage overlap?
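
To make clear what I mean by percentage overlap, here is a minimal
sketch in plain Python. It is not R and not the GenomicRanges API; the
names overlap_fraction and filter_by_overlap are made up for
illustration. I am assuming BED-style coordinates (0-based start,
exclusive end) and that the percentage is measured against the length
of the query sequence, not the gene.

```python
from typing import List, Tuple

# Hypothetical record type: (chrom, start, end), BED-style half-open.
Interval = Tuple[str, int, int]


def overlap_fraction(query: Interval, subject: Interval) -> float:
    """Fraction of the query's length that is covered by the subject."""
    q_chrom, q_start, q_end = query
    s_chrom, s_start, s_end = subject
    # Different chromosomes, or an empty query, cannot overlap.
    if q_chrom != s_chrom or q_end <= q_start:
        return 0.0
    shared = min(q_end, s_end) - max(q_start, s_start)
    return max(shared, 0) / (q_end - q_start)


def filter_by_overlap(
    queries: List[Interval],
    genes: List[Interval],
    min_fraction: float = 1.0,
) -> List[Tuple[Interval, Interval]]:
    """Return (query, gene) pairs whose overlap fraction is at least min_fraction.

    Brute-force all-against-all comparison; fine for a sketch, too slow
    for genome-scale data.
    """
    hits = []
    for q in queries:
        for g in genes:
            if overlap_fraction(q, g) >= min_fraction:
                hits.append((q, g))
    return hits


# Made-up example data, not taken from any real BED file.
genes = [("chr1", 100, 200)]
reads = [("chr1", 120, 160), ("chr1", 180, 220), ("chr2", 120, 160)]

inside = filter_by_overlap(reads, genes, min_fraction=1.0)  # the "within" case
half = filter_by_overlap(reads, genes, min_fraction=0.5)    # at least 50% overlap
```

With min_fraction=1.0 this reduces to the "within" case; lowering the
threshold is the behaviour I am looking for.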

Thanks,

D.


