[BioC] find overlap of bed files of different length

Hervé Pagès hpages at fhcrc.org
Tue Feb 1 23:26:31 CET 2011


Hi,

On 02/01/2011 11:08 AM, Michael Lawrence wrote:
[...]
> This is a reasonable request. As Kasper mentioned, it's possible with post
> processing.
>
> E.g.:
>
> m <- findOverlaps(query, subject)
> percentOverlap <- width(ranges(m, query, subject)) /
>     width(query)[queryHits(m)]
> keep <- percentOverlap > cutoff
>
> Perhaps someone up North could add this to IRanges/GenomicRanges?

Would be my pleasure. Yes, the functionality provided by this
ranges,RangesMatching method is very useful, but maybe we should try
to give it a little more exposure. Right now it's kind of "hidden"
in the man page for RangesMatching (which you get with something
like ?`RangesMatching-class` or ?`ranges,RangesMatching-method`) and
is not mentioned in any vignette AFAIK. Also, it works only when query
and subject are both Ranges objects, not when they are GRanges objects.
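
For reference, here is a minimal sketch of the same post-processing that
does not depend on the ranges,RangesMatching method at all. It assumes an
IRanges version that provides pintersect(), queryHits() and subjectHits();
the toy ranges and the cutoff value are made up for illustration.

```r
library(IRanges)

# Toy data (hypothetical): three query ranges, three subject ranges
query   <- IRanges(start = c(1, 20, 50),  end = c(10, 30, 60))
subject <- IRanges(start = c(5, 25, 100), end = c(15, 28, 110))

m <- findOverlaps(query, subject)

# Intersection of each overlapping (query, subject) pair, in hit order
ov <- pintersect(query[queryHits(m)], subject[subjectHits(m)])

# Fraction of each query range covered by the subject range it hits
percentOverlap <- width(ov) / width(query)[queryHits(m)]

cutoff <- 0.5                  # assumed threshold
keep <- percentOverlap > cutoff
m[keep]                        # hits that pass the cutoff
```

Here the first query range is covered at 6/10 and the second at 4/11, so
only the first hit passes a cutoff of 0.5.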

So I propose to:

   1. Rename this to something like overlaps(), or overlappingRanges(),
      or ... (fill in the blank), and deprecate the ranges,RangesMatching
      method.
      overlaps() would be a new generic with the m,query,subject
      signature.
      The current name is probably part of the reason why nobody (except
      the original author of the method) knew about it.
      I personally find it unexpected that a quite non-specific name
      like "ranges" is used for this when it generally plays the role
      of an accessor to get/set the ranges of containers like IRanges,
      RangedData, GRanges, etc...

   2. Add an "overlaps" method for
      RangesMatching,GenomicRanges,GenomicRanges.

   3. Illustrate how to use this in the vignettes of GenomicRanges.
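
To make items 1 and 2 a bit more concrete, here is a rough sketch of how
the m,query,subject signature could be used with GRanges objects. The
helper overlapFraction() is a hypothetical plain function written only for
this illustration (it is not an existing or proposed IRanges/GenomicRanges
API), and it assumes a GenomicRanges version where pintersect() accepts two
GRanges objects of equal length.

```r
library(GenomicRanges)

# Hypothetical helper: fraction of each query range covered by the
# subject range it overlaps, one value per hit in 'm'
overlapFraction <- function(m, query, subject) {
  q <- query[queryHits(m)]
  s <- subject[subjectHits(m)]
  width(pintersect(q, s)) / width(q)
}

# Toy data (made up): two query and two subject ranges on chr1
query   <- GRanges("chr1", IRanges(start = c(1, 20), end = c(10, 30)))
subject <- GRanges("chr1", IRanges(start = c(5, 25), end = c(15, 28)))

m    <- findOverlaps(query, subject)
frac <- overlapFraction(m, query, subject)
kept <- m[frac > 0.5]   # keep hits covering more than half of the query
```

A real implementation would presumably be an S4 generic with methods,
but the arithmetic would be the same.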

Would that be OK?

H.

>
> Michael


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


