[BioC] find overlap of bed files of different length

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Tue Feb 1 18:07:10 CET 2011


Well, clearly I have not done it, but I would expect a decent
implementation of my method to take less than 2 minutes (although
that depends on how many intervals are in the BED files you started
with). At least the computational load should not be much more than
that of running findOverlaps.

Kasper

On Tue, Feb 1, 2011 at 10:06 AM, Duke <duke.lists at gmx.com> wrote:
> On 1/31/11 1:20 PM, Kasper Daniel Hansen wrote:
>>
>> Use findOverlaps to find all cases.  This is usually the hard and big
>> computation.  Then use for example pintersect() to compute the actual
>> overlap in percent.  There might be some tedious coding involved.
>
> Thanks for your suggestion, Kasper, though honestly I have not tried it yet.
> Based on what Martin and you suggested, I expected the final code would
> not run fast, because it has to split the data by strand and run each
> subset separately. My task is also a little more complicated: I need to
> estimate gene expression (counting the sequences that fall in the exonic
> regions of each gene). I also gave BEDTools a try, but it does not meet
> my needs (extremely slow for a gene list of 28k).
>
> I ended up writing C++ code to do the job. Thanks to all of you for your
> suggestions and help.
>
> D.
>
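
The two-step approach described in the quoted message (first find every
overlapping pair, then compute how much of each pair actually overlaps) can
be sketched as follows. This is only a plain-Python illustration of the idea,
not the Bioconductor findOverlaps/pintersect API. The function names
find_overlaps and overlap_fraction are hypothetical, intervals are assumed to
be half-open (start, end) pairs as in BED, every query interval is assumed to
have positive width, and strand and chromosome are ignored.

```python
from bisect import bisect_left


def find_overlaps(query, subject):
    """Return (i, j) index pairs where query[i] overlaps subject[j].

    Intervals are half-open (start, end) tuples, as in BED.
    Hypothetical helper; not the Bioconductor function of similar name.
    """
    # Sort subject indices by start so a binary search can discard every
    # subject interval that starts at or after the end of the query.
    order = sorted(range(len(subject)), key=lambda j: subject[j][0])
    starts = [subject[j][0] for j in order]
    hits = []
    for i, (q_start, q_end) in enumerate(query):
        upper = bisect_left(starts, q_end)
        # Simple linear scan over the remaining candidates; an interval
        # tree would avoid this scan on large inputs.
        for k in range(upper):
            j = order[k]
            if subject[j][1] > q_start:
                hits.append((i, j))
    return hits


def overlap_fraction(a, b):
    """Fraction of interval a that is covered by interval b."""
    width = min(a[1], b[1]) - max(a[0], b[0])
    return max(width, 0) / (a[1] - a[0])


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    query = [(100, 200), (300, 400)]
    subject = [(150, 250), (390, 500), (1000, 1100)]
    for i, j in find_overlaps(query, subject):
        frac = overlap_fraction(query[i], subject[j])
        print(i, j, f"{100 * frac:.1f}%")
```

Keeping the pair-finding step separate from the width computation mirrors
the suggestion above: the first step is the expensive one, and the second is
a cheap arithmetic pass over the pairs it returns.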
