[BioC] DiffBind and GRanges error extracting overlapping peaks using dba.overlap

Matt Zinkgraf mzinkgraf at gmail.com
Thu Nov 14 19:14:46 CET 2013


Hi Rory
That did the trick. Thanks
Matt

-----Original Message-----
From: Rory Stark [mailto:Rory.Stark at cruk.cam.ac.uk] 
Sent: Thursday, November 14, 2013 10:08 AM
To: Matt Zinkgraf
Cc: bioconductor at r-project.org
Subject: Re: DiffBind and GRanges error extracting overlapping peaks using dba.overlap

Hi Matt-

Yep, it's a bug!

If you plot this as a Venn, you can see one peakset has zero elements:

> dba.plotVenn(chip, 16:19)


There's no check for an empty peakset when this is converted to a GRanges object. I'll fix this and check it in soon. In the meantime, you can work around this by using data frames instead of GRanges, either by:

> chip.OL = dba.overlap(chip, c(16,17,18,19), mode=DBA_OLAP_PEAKS, DataType=DBA_DATA_FRAME)

or by changing the default data type:

> chip$config$DataType = DBA_DATA_FRAME


I tend to run with DBA_DATA_FRAME as the default, which is probably why I didn't spot this bug before.
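
To see which peakset is the empty one once the data frame workaround is in place, something along these lines may help. It is only a sketch in base R: the list below is a toy stand-in for what dba.overlap returns, and the element names (onlyA, onlyB, inAll) and column names (CHR, START, END) are assumptions made for illustration, not taken from the actual chip object.

```r
# Toy stand-in for a list of peaksets returned as data frames.
# Element names and column names are assumptions for illustration only.
empty <- data.frame(CHR = character(0), START = integer(0), END = integer(0))
overlaps <- list(
  onlyA = data.frame(CHR = c("chr1", "chr2"),
                     START = c(100L, 500L),
                     END = c(200L, 650L)),
  onlyB = empty,  # a zero-row peakset, like the empty region in the Venn
  inAll = data.frame(CHR = "chr3", START = 40L, END = 90L)
)

# Count intervals per peakset; a count of zero marks an empty peakset.
n.peaks <- sapply(overlaps, nrow)
print(n.peaks)

# Keep only the non-empty peaksets before any further conversion.
nonempty <- overlaps[n.peaks > 0]
```

With the real result, chip.OL would take the place of the toy list. Any conversion of the remaining data frames to GRanges would then skip the zero-row case.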

Cheers-
Rory


On 14/11/2013 16:00, "Matt Zinkgraf" <mzinkgraf at gmail.com> wrote:

>Hi Rory
>Thanks for the response.  The chip object can be found at 
>https://dl.dropboxusercontent.com/u/96655685/chip.rdata
>
>Matt
>
>-----Original Message-----
>From: Rory Stark [mailto:Rory.Stark at cruk.cam.ac.uk]
>Sent: Thursday, November 14, 2013 4:05 AM
>To: mzinkgraf at gmail.com
>Cc: bioconductor at r-project.org
>Subject: Re: DiffBind and GRanges error extracting overlapping peaks 
>using dba.overlap
>
>Hi Matt-
>
>Your code looks good -- this looks like a bug. It must be something 
>about the specific 4-way overlap that you are doing as I can't 
>reproduce it with some datasets I have.
>
>Is there a way you can share the DiffBind Object ("chip") with me so I 
>can debug it? Dropbox perhaps?
>
>Cheers-
>Rory
>
>On 14/11/2013 01:04, "Matt Zinkgraf [guest]" <guest at bioconductor.org>
>wrote:
>
>>
>>Hello
>>I am using DiffBind to identify consensus binding sites for multiple 
>>transcription factors that have biological replicates.  In addition, I 
>>want to investigate the overlap of binding sites across the 
>>transcription factors.  I am able to call consensus peaks for each 
>>transcription factor and calculate the overlap rate and plot overlaps 
>>with dba.plotVenn, but I am getting an error from GRanges when trying 
>>to extract the actual overlapping peaks using dba.overlap and 
>>DBA_OLAP_PEAKS. Any suggestions on why I am getting this error?
>>
>>Thanks
>>Matt
>>
>>> #load datasets
>>> chip= dba(sampleSheet="chip_datasets_testing.csv", peakCaller="bed")
>>a4142.1   a21 ARK2 1 bed
>>a4142.2   a22 ARK2 2 bed
>>a0304.1   a23 ARK2 1 bed
>>a0304.2   a24 ARK2 2 bed
>>r4748.1   r11 REV 1 bed
>>r8586.1   r21 REV 1 bed
>>r8586.2   r22 REV 2 bed
>>c4344.1   c11 PCN 1 bed
>>c4344.2   c12 PCN 2 bed
>>c4546.1   c21 PCN 1 bed
>>c4546.2   c22 PCN 2 bed
>>a3738.1   a11 ARK1 1 bed
>>a3738.2   a12 ARK1 2 bed
>>a3940.2   a14 ARK1 2 bed
>>a3940.1   a13 ARK1 1 bed
>>> 
>>> #create consensus peaks and plot overlap
>>> chip = dba.peakset(chip, consensus = DBA_TREATMENT, minOverlap = 0.5)
>>Add consensus: ARK2
>>Add consensus: REV
>>Add consensus: PCN
>>Add consensus: ARK1
>>> 
>>> chip
>>19 Samples, 12123 sites in matrix (29678 total):
>>        ID       Condition Treatment Replicate Peak.caller Intervals
>>1  a4142.1             a21      ARK2         1         bed      3239
>>2  a4142.2             a22      ARK2         2         bed      2026
>>3  a0304.1             a23      ARK2         1         bed      2718
>>4  a0304.2             a24      ARK2         2         bed       581
>>5  r4748.1             r11       REV         1         bed      6958
>>6  r8586.1             r21       REV         1         bed       595
>>7  r8586.2             r22       REV         2         bed       869
>>8  c4344.1             c11       PCN         1         bed      8526
>>9  c4344.2             c12       PCN         2         bed       803
>>10 c4546.1             c21       PCN         1         bed      5524
>>11 c4546.2             c22       PCN         2         bed      5320
>>12 a3738.1             a11      ARK1         1         bed      7443
>>13 a3738.2             a12      ARK1         2         bed      7004
>>14 a3940.2             a14      ARK1         2         bed      5697
>>15 a3940.1             a13      ARK1         1         bed       761
>>16    ARK2 a21-a22-a23-a24      ARK2       1-2         bed      2030
>>17     REV     r11-r21-r22       REV       1-2         bed       548
>>18     PCN c11-c12-c21-c22       PCN       1-2         bed      3500
>>19    ARK1 a11-a12-a14-a13      ARK1       1-2         bed      6210
>>> 
>>> dba.overlap(chip,c(16,17,18,19), mode=DBA_OLAP_RATE)
>>[1] 10006  1733   327    11
>>> 
>>> chip.OL = dba.overlap(chip, c(16,17,18,19),mode=DBA_OLAP_PEAKS)
>>Error in validObject(.Object) :
>>  invalid class “GRanges” object: NROW(strand(x)) != length(x)
>>
>> -- output of sessionInfo():
>>
>>> sessionInfo()
>>R version 3.0.2 (2013-09-25)
>>Platform: x86_64-w64-mingw32/x64 (64-bit)
>>
>>locale:
>>[1] LC_COLLATE=English_United States.1252
>>[2] LC_CTYPE=English_United States.1252
>>[3] LC_MONETARY=English_United States.1252
>>[4] LC_NUMERIC=C
>>[5] LC_TIME=English_United States.1252
>>
>>attached base packages:
>>[1] parallel  stats     graphics  grDevices utils     datasets  methods
>>[8] base     
>>
>>other attached packages:
>>[1] DiffBind_1.8.2       Biobase_2.22.0       GenomicRanges_1.14.3
>>[4] XVector_0.2.0        IRanges_1.20.5       BiocGenerics_0.8.0
>>
>>loaded via a namespace (and not attached):
>> [1] amap_0.8-7         bitops_1.0-6       caTools_1.16
>> [4] edgeR_3.4.0        gdata_2.13.2       gplots_2.12.1
>> [7] gtools_3.1.1       KernSmooth_2.23-10 limma_3.18.2
>>[10] RColorBrewer_1.0-5 stats4_3.0.2       tools_3.0.2
>>[13] zlibbioc_1.8.0
>>
>>--
>>Sent via the guest posting facility at bioconductor.org.
>
>


