[BioC] DiffBind counts
Gordon Brown
Gordon.Brown at cruk.cam.ac.uk
Thu Apr 3 17:13:31 CEST 2014
Hi, Naomi et al,
This is a bug in DiffBind's duplicate-removal code. The bug is fixed in
BioC 2.13 and will (I hope) make it into the last build, if there is one
more. It's also fixed in the development stream, so it will make it into
the next release.
Counting without removing duplicates is, as far as I know, correct.
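To see which peaks are affected, the two runs can be compared row by row. The following is only a minimal base-R sketch, using the Reads values for the first six peaks copied from the output quoted below; the names run1, run2, changed and diffs are made up for illustration and are not part of DiffBind.

```r
# Reads values copied from the two head() outputs quoted below
# (same six chr19 peaks, same dba.count() call, two separate runs).
run1 <- data.frame(
  Start = c(4113108, 4878390, 4961642, 5724175, 5798432, 5801387),
  Reads = c(13, 139, 61, 56, 152, 98)
)
run2 <- data.frame(
  Start = c(4113108, 4878390, 4961642, 5724175, 5798432, 5801387),
  Reads = c(13, 57, 58, 56, 153, 98)
)

# The peaks are listed in the same order in both runs,
# so the rows can be compared position by position.
stopifnot(identical(run1$Start, run2$Start))

# Flag the peaks whose read count differs between the runs.
changed <- run1$Reads != run2$Reads
diffs <- data.frame(
  Start  = run1$Start[changed],
  Reads1 = run1$Reads[changed],
  Reads2 = run2$Reads[changed]
)
print(diffs)
```

With the real objects, the same comparison would presumably be made on counts$peaks[[1]]$Reads and countsR1$peaks[[1]]$Reads. Until the fix is available, rerunning dba.count with bRemoveDuplicates=FALSE should give counts that do not change between runs.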
Sorry for the inconvenience and many thanks for bringing it to our
attention.
Cheers,
- Gord
On 2014-04-03 11:00, "bioconductor-request at r-project.org"
<bioconductor-request at r-project.org> wrote:
>------------------------
>
>Message: 24
>Date: Wed, 02 Apr 2014 21:50:50 -0400
>From: Naomi Altman <naomi at stat.psu.edu>
>To: Bioconductor mailing list <bioconductor at r-project.org>
>Subject: [BioC] DiffBind counts
>Message-ID: <533CBE7A.6050101 at stat.psu.edu>
>Content-Type: text/plain
>
>
>I am trying to understand what DiffBind is doing. Notice that I executed
>exactly the same command on the same data twice and got different counts.
>
>
>counts=dba.count(mydata,minOverlap=3,score="DBA_SCORE_READS",
>bRemoveDuplicates=TRUE, bCorPlot=FALSE)
>head(counts$peaks[[1]])
>
>    Chr   Start     End Score      RPKM Reads    cRPKM cReads
>1 chr19 4113108 4113591     1  34.83796    13 29.97905     12
>2 chr19 4878390 4879327   126 192.01349   139 16.92521     13
>3 chr19 4961642 4962405    47 103.48129    61 22.59234     14
>4 chr19 5724175 5724774    46 121.00902    56 20.72008     10
>5 chr19 5798432 5799635   137 163.54396   152 15.47547     15
>6 chr19 5801387 5802104    90 176.91451    98 13.46340      8
>
>
>
>countsR1=dba.count(mydata,minOverlap=3,score="DBA_SCORE_READS",
>bRemoveDuplicates=TRUE, bCorPlot=FALSE)
>head(countsR1$peaks[[1]])
>
>
>
>    Chr   Start     End Score      RPKM Reads    cRPKM cReads
>1 chr19 4113108 4113591     1  34.84423    13 29.98011     12
>2 chr19 4878390 4879327    52  78.75352    57  6.62314      5
>3 chr19 4961642 4962405    44  98.40975    58 22.59313     14
>4 chr19 5724175 5724774    46 121.03080    56 20.72081     10
>5 chr19 5798432 5799635   141 164.64953   153 12.61009     12
>6 chr19 5801387 5802104    90 176.94635    98 13.46387      8
>
>
>R version 3.0.2 (2013-09-25)
>Platform: x86_64-w64-mingw32/x64 (64-bit)
>
>locale:
>[1] LC_COLLATE=English_United States.1252
>[2] LC_CTYPE=English_United States.1252
>[3] LC_MONETARY=English_United States.1252
>[4] LC_NUMERIC=C
>[5] LC_TIME=English_United States.1252
>
>attached base packages:
>[1] parallel stats graphics grDevices utils datasets methods
>[8] base
>
>other attached packages:
>[1] DiffBind_1.8.4 GenomicRanges_1.14.4 XVector_0.2.0
>[4] IRanges_1.20.7 BiocGenerics_0.8.0
>
>loaded via a namespace (and not attached):
> [1] amap_0.8-12 bitops_1.0-6 caTools_1.16
> [4] edgeR_3.4.2 gdata_2.13.2 gplots_2.12.1
> [7] gtools_3.3.1 KernSmooth_2.23-12 limma_3.18.13
>[10] RColorBrewer_1.0-5 stats4_3.0.2 tools_3.0.2
>[13] zlibbioc_1.8.0
>
>