[BioC] Minimal region of overlap between two GRanges objects

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Tue Dec 13 20:42:28 CET 2011


Two things

(1) watch out that the return object from intersect is a normalized
IRanges.  So adjacent Ranges will be merged.  Whether or not that is a
problem depends on what representation you are using.

(2) Tim: you could use coverage() and slice() on an object you get by
c()'ing the individual level IRanges together.  Start by ignoring the
score column.

Kasper

On Tue, Dec 13, 2011 at 1:31 PM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
> Supposing I wanted to do this with several hundred tumors to look for
> hotspots in certain subtypes, is there a convenient way to do this
> internally with one or two great big GRanges?  Or is this the sort of
> situation where I should use a genoset?
>
> Example (CNV.GR is a GRanges for a tumor type, built from GISTIC CNV calls
> achieving regional statistical significance):
>
>> EZH2 <- getCNV('EZH2', CNV.GR) # subsetByOverlaps(CNV.GR, getExons(gene,
> basesUp))
>> head(EZH2)
> GRanges with 6 ranges and 1 elementMetadata value:
>               seqnames                 ranges strand |     score
>                  <Rle>              <IRanges>  <Rle> | <numeric>
>  2915.Del.170     chr7 [140579242, 159118025]      * |        -1
>   2952.Del.93     chr7 [141787637, 157040032]      * |        -1
>  2929.Del.258     chr7 [141956277, 148251831]      * |        -1
>   2874.Del.44     chr7 [142484285, 159127005]      * |        -1
>   2857.Del.97     chr7 [142486080, 154885611]      * |        -1
>   2805.Del.78     chr7 [142486080, 154970916]      * |        -1
>> sum(score(EZH2))
> [1] -22
>
> So exons of EZH2 are frequently deleted in the particular set of samples
> I'm looking at.  Myc is amplified a great deal.  etc. But I knew what I
> wanted to look for in those cases.  Suppose I instead want to blob together
> and scan various tumors by common mutations, or known subtypes of driver
> mutations.  It's handy for me to just slap a +1 or -1 in the 'score' column
> and add it up.  It would be handier still for me to do running tallies of
> this... it seems like the sort of thing that people would have solved years
> ago, I just don't know precisely what to look for in terms of documentation
> for it.
>
> Any assistance is appreciated.
>
> --t
>
>
>
> On Tue, Dec 13, 2011 at 8:55 AM, Daniel Brewer <Daniel.Brewer at icr.ac.uk>wrote:
>
>> Thanks Martin,
>>
>> Yes you're right a simple intersect works great.  I had a suspicion I was
>> missing something very obvious.  Many thanks.
>>
>> Dan
>>
>> On 13 Dec 2011, at 4:49 PM, Martin Morgan wrote:
>>
>> > On 12/13/2011 07:00 AM, Daniel Brewer wrote:
>> >> Dear mailing list,
>> >>
>> >> I have two GRanges objects each containing segments of copy number
>> >> change for a tumour.  I would like to find regions where there is
>> >> copy number change in both the tumours.  I though findOverlaps was
>> >> giving me what I wanted but soon found that findOverlaps(A,B) !=
>> >> findOverlaps(B,A).  I am sure there must be a reasonable way to do
>> >> this but I cannot find it.
>> >
>> > Hi Dan --
>> >
>> >> a = GRanges("chrA", IRanges(11, 20), seqlengths=c(chrA=100))
>> >> b = GRanges("chrA", IRanges(16, 25), seqlengths=c(chrA=100))
>> >> intersect(a, b)
>> > GRanges with 1 range and 0 elementMetadata values:
>> >       seqnames    ranges strand
>> >          <Rle> <IRanges>  <Rle>
>> >   [1]     chrA  [16, 20]      *
>> >   ---
>> >   seqlengths:
>> >    chrA
>> >     100
>> >
>> > but maybe a working example from your end would help clarify what you're
>> > after / where things go wrong for you with findOverlaps.
>> >
>> > Martin
>> >
>> >>
>> >> Thanks
>> >>
>> >> Dan
>> >>
>> >
>> >
>> > --
>> > Computational Biology
>> > Fred Hutchinson Cancer Research Center
>> > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>> >
>> > Location: M1-B861
>> > Telephone: 206 667-2793
>>
>> --
>> **************************************************************
>>
>> Daniel Brewer, Ph.D.
>>
>> Institute of Cancer Research
>> Molecular Carcinogenesis
>> MUCRC
>> 15 Cotswold Road
>> Sutton, Surrey SM2 5NG
>> United Kingdom
>>
>> Tel: +44 (0) 20 8722 4109
>>
>> **************************************************************
>>
>>
>>
>>
>> The Institute of Cancer Research: Royal Cancer Hospital, a charitable
>> Company Limited by Guarantee, Registered in England under Company No.
>> 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP.
>>
>> This e-mail message is confidential and for use by the...{{dropped:20}}
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list