[BioC] Intersecting two chromosomal ranges but keeping the unique ones

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Mar 12 16:45:09 CET 2012


Hi,

On Mon, Mar 12, 2012 at 11:41 AM, Yadav Sapkota <ysapkota at ualberta.ca> wrote:
> Hi,
>
> I am trying to create a redundant list from two different lists of
> chromosomal ranges as follow:
>
> *List 1:*
>   Chr Start Stop  chr1 0 98595  chr1 17012368 17012439  chr1 17017304
> 17029919  chr1 17246620 17250887
> *List 2:*
>   Chr Start Stop  chr1 82483 98595  chr1 17225296 17232195  chr1 17235775
> 17236214  chr1 17246096 17257401
>
>
> >From these two lists, I want to create one list by (i) intersecting the
> regions if they overlap between two lists but (ii) also keeping the ones
> unique in either of the lists. So far, I can do the intersection using
> GRanges (suggested by Martin Morgan) but can not get the unique regions
> only present in either groups. I am using the folowing code:
>
> source("http://bioconductor.org/biocLite.R")
> biocLite("GenomicRanges")
> library(GenomicRanges)
> rel=read.csv('file1.csv', header=T, sep=',')
> nonrel=read.csv(file2.csv', header=T, sep=',')
> subject=GRanges(rel$Chr, IRanges(rel$Start, rel$Stop))
> Query=GRanges(nonrel$Chr, IRanges(nonrel$Start, nonrel$Stop))
> result=as.data.frame(intersect (subject,Query))
> write.table(result, file='Result.csv', sep=',')
>
> Could anyone tell me how to insert a piece of code to keep the unique
> regions in either of the lists?

Sure, skim through the ?findOverlaps docs as well as its `see also` section.

For instance, a "low level" way to see which regions in subject don't
appear in query:

subject.only <- countOverlaps(subject, Query) == 0

But `match` and `%in%` are overloaded to work with *Ranges-like
objects, so you can play around with those to see which ones you
prefer.

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



More information about the Bioconductor mailing list