[BioC] DiffBind counts

Gordon Brown Gordon.Brown at cruk.cam.ac.uk
Thu Apr 3 17:13:31 CEST 2014


Hi, Naomi et al,

This is a bug in DiffBind's duplicate-removal code.  The fix is in the
BioC 2.13 branch and will (I hope) make it into the last 2.13 build, if
there is one more.  It is also fixed in the development stream, so it will
be in the next release.

Counting without removing duplicates is (as far as I know) correct.
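
Until the fix is available, one way to check whether a rerun is stable is
to compare the read columns of two peak tables.  The sketch below is only
an illustration: compare_counts is a hypothetical helper (not part of
DiffBind), and the two small data frames just copy the Reads and cReads
values from the first three rows of the two runs quoted below.  The
commented-out call at the end is the original command with only
bRemoveDuplicates changed; it assumes the poster's mydata object.

```r
# Hypothetical helper (not a DiffBind function): return the row indices
# where two peak tables disagree in any of the given count columns.
compare_counts <- function(a, b, cols = c("Reads", "cReads")) {
  stopifnot(nrow(a) == nrow(b))
  differs <- rowSums(a[, cols, drop = FALSE] != b[, cols, drop = FALSE]) > 0
  which(differs)
}

# Reads/cReads for the first three peaks of each run, copied from the
# output quoted in this thread.
run1 <- data.frame(Reads = c(13, 139, 61), cReads = c(12, 13, 14))
run2 <- data.frame(Reads = c(13, 57, 58), cReads = c(12, 5, 14))

# Rows whose counts changed between the two runs.
changed <- compare_counts(run1, run2)
print(changed)

# Workaround sketch with real data (needs DiffBind and the poster's mydata):
# counts <- dba.count(mydata, minOverlap=3, score="DBA_SCORE_READS",
#                     bRemoveDuplicates=FALSE, bCorPlot=FALSE)
# countsR1 <- dba.count(mydata, minOverlap=3, score="DBA_SCORE_READS",
#                       bRemoveDuplicates=FALSE, bCorPlot=FALSE)
# compare_counts(counts$peaks[[1]], countsR1$peaks[[1]])
```

If the two runs with bRemoveDuplicates=FALSE agree, the helper returns an
empty result for every sample's peak table.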

Sorry for the inconvenience and many thanks for bringing it to our
attention.

Cheers,

 - Gord


On 2014-04-03 11:00, "bioconductor-request at r-project.org"
<bioconductor-request at r-project.org> wrote:

>Date: Wed, 02 Apr 2014 21:50:50 -0400
>From: Naomi Altman <naomi at stat.psu.edu>
>To: Bioconductor mailing list <bioconductor at r-project.org>
>Subject: [BioC] DiffBind counts
>Message-ID: <533CBE7A.6050101 at stat.psu.edu>
>Content-Type: text/plain
>
>
>I am trying to understand what DiffBind is doing. Notice that I executed
>exactly the same command on the same data twice and got different counts.
>
>
>counts=dba.count(mydata,minOverlap=3,score="DBA_SCORE_READS",
>bRemoveDuplicates=TRUE, bCorPlot=FALSE)
>head(counts$peaks[[1]])
>
>  Chr   Start     End Score      RPKM Reads    cRPKM cReads
>1 chr19 4113108 4113591     1  34.83796    13 29.97905     12
>2 chr19 4878390 4879327   126 192.01349   139 16.92521     13
>3 chr19 4961642 4962405    47 103.48129    61 22.59234     14
>4 chr19 5724175 5724774    46 121.00902    56 20.72008     10
>5 chr19 5798432 5799635   137 163.54396   152 15.47547     15
>6 chr19 5801387 5802104    90 176.91451    98 13.46340      8
>
>
>
>countsR1=dba.count(mydata,minOverlap=3,score="DBA_SCORE_READS",
>bRemoveDuplicates=TRUE, bCorPlot=FALSE)
>head(countsR1$peaks[[1]])
>
>  Chr   Start     End Score      RPKM Reads    cRPKM cReads
>1 chr19 4113108 4113591     1  34.84423    13 29.98011     12
>2 chr19 4878390 4879327    52  78.75352    57  6.62314      5
>3 chr19 4961642 4962405    44  98.40975    58 22.59313     14
>4 chr19 5724175 5724774    46 121.03080    56 20.72081     10
>5 chr19 5798432 5799635   141 164.64953   153 12.61009     12
>6 chr19 5801387 5802104    90 176.94635    98 13.46387      8
>
>
>R version 3.0.2 (2013-09-25)
>Platform: x86_64-w64-mingw32/x64 (64-bit)
>
>locale:
>[1] LC_COLLATE=English_United States.1252
>[2] LC_CTYPE=English_United States.1252
>[3] LC_MONETARY=English_United States.1252
>[4] LC_NUMERIC=C
>[5] LC_TIME=English_United States.1252
>
>attached base packages:
>[1] parallel  stats     graphics  grDevices utils     datasets  methods
>[8] base
>
>other attached packages:
>[1] DiffBind_1.8.4       GenomicRanges_1.14.4 XVector_0.2.0
>[4] IRanges_1.20.7       BiocGenerics_0.8.0
>
>loaded via a namespace (and not attached):
>  [1] amap_0.8-12        bitops_1.0-6       caTools_1.16
>  [4] edgeR_3.4.2        gdata_2.13.2       gplots_2.12.1
>  [7] gtools_3.3.1       KernSmooth_2.23-12 limma_3.18.13
>[10] RColorBrewer_1.0-5 stats4_3.0.2       tools_3.0.2
>[13] zlibbioc_1.8.0
>
>


