[BioC] False positives due to GC content correction - DESeq2

QAMRA Aditi (GIS) qamraa99 at gis.a-star.edu.sg
Fri Aug 8 20:29:10 CEST 2014

Hi Mike,

Sorry, it seems my message got cut off midway. What I was saying was that I don't understand how I can work out the source of these false positives. Yes, these are regions that I know are not differentially expressed.

I've attached the code for the analysis as well as the dispersion plots.

Session Info -
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] EDASeq_1.10.0           aroma.light_2.0.0       matrixStats_0.10.0
 [4] ShortRead_1.22.0        GenomicAlignments_1.0.3 BSgenome_1.32.0
 [7] Rsamtools_1.16.1        Biostrings_2.32.1       XVector_0.4.0
[10] BiocParallel_0.6.1      Biobase_2.24.0          DESeq2_1.4.5
[13] RcppArmadillo_0.4.320.0 Rcpp_0.11.2             GenomicRanges_1.16.3
[16] GenomeInfoDb_1.0.2      IRanges_1.22.10         BiocGenerics_0.10.0
[19] BiocInstaller_1.14.2

loaded via a namespace (and not attached):
 [1] annotate_1.42.1      AnnotationDbi_1.26.0 BatchJobs_1.3
 [4] BBmisc_1.7           bitops_1.0-6         brew_1.0-6
 [7] checkmate_1.2        codetools_0.2-8      DBI_0.2-7
[10] DESeq_1.16.0         digest_0.6.4         fail_1.2
[13] foreach_1.4.2        genefilter_1.46.1    geneplotter_1.42.0
[16] grid_3.1.0           hwriter_1.3          iterators_1.0.7
[19] lattice_0.20-29      latticeExtra_0.6-26  locfit_1.5-9.1
[22] RColorBrewer_1.0-5   R.methodsS3_1.6.1    R.oo_1.18.0
[25] RSQLite_0.11.4       sendmailR_1.1-2      splines_3.1.0
[28] stats4_3.1.0         stringr_0.6.2        survival_2.37-7
[31] tools_3.1.0          XML_3.98-1.1         xtable_1.7-3
[34] zlibbioc_1.10.0

From: Michael Love [michaelisaiahlove at gmail.com]
Sent: Saturday, August 09, 2014 2:11 AM
To: Aditi [guest]
Cc: bioconductor at r-project.org; QAMRA Aditi (GIS)
Subject: Re: False positives due to GC content correction - DESeq2

hi Aditi,

Please include all the code you used for EDASeq and DESeq2, and the

How do you know these are false positives? Are these genes which you
know are not differentially expressed?

Your dispersion plots didn't come through. You can email those
attachments to my email address, and we will continue discussion on
the Bioc list.


On Fri, Aug 8, 2014 at 1:54 PM, Aditi [guest] <guest at bioconductor.org> wrote:
> Hi Mike,
> I have been trying to use DESeq2 for a differential analysis of ChIP-seq data using 8 T/N pairs. There is a lot of heterogeneity in the samples due to clinical differences (tumor stage etc.), total mapped reads (some samples are much better than others), and batch effects (they were processed at different times and not by the same person). I wanted to correct at least some of the biases, starting with GC content, so I used offsets from EDASeq as an input to DESeq2 and introduced the batch variable in the model.
> What I don't understand is that when I corrected for GC bias in the samples, the final results tended to have a lot of false positives. I have attached the dispersion plots for both runs. I can't seem to figure why
>  -- output of sessionInfo():
> -
> --
> Sent via the guest posting facility at bioconductor.org.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: EDAseq+DESeq_Script.txt
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140809/04d467b0/attachment.txt>
