[BioC] DESeq2 : Using Normalised ReadCount matrix from EDAseq in DESeq2
Aditi [guest]
guest at bioconductor.org
Thu Jun 26 15:16:24 CEST 2014
Hi,
I want to use a normalised read count matrix from EDASeq downstream in a DESeq2 analysis, but it is not clear to me from the vignette how to do so.
These are the steps I followed:
## EDASeq - normalising the count matrix by GC content
> dataWithin <- withinLaneNormalization(data, "pct_gc", which = "full")
> dataNorm <- betweenLaneNormalization(dataWithin, which = "full")
## I normalised the counts themselves instead of generating the offsets as mentioned in the EDASeq vignette
### DESeq2
> ??
> dds <- estimateDispersions(dds)
> dds <- nbinomWaldTest(dds)
> res <- results(dds)
I don't know how to create a normalization factor matrix. The DESeq2 vignette, on the other hand, mentions that normalization factors should be on the scale of the counts, like size factors, and unlike offsets, which are typically on the scale of the predictors (i.e. the logarithmic scale for the negative binomial GLM).
So in that case, should I generate the offset values from EDASeq instead, i.e.:
> dataWithin <- withinLaneNormalization(data, "pct_gc", which = "full", offset = TRUE)
> dataNorm <- betweenLaneNormalization(dataWithin, which = "full", offset = TRUE)
> EDASeqNormFactors <- exp(-1 * offst(dataNorm))
> normalizationFactors(dds) <- EDASeqNormFactors
> dds <- estimateDispersions(dds)
> dds <- nbinomWaldTest(dds)
> res <- results(dds)
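To illustrate the conversion from log-scale offsets to count-scale normalization factors, here is a toy sketch in base R only. It does not use EDASeq or DESeq2 objects. It assumes that the offset is defined as log(normalised counts) minus log(raw counts), which is my reading of the EDASeq convention and may be wrong. The matrix `scaling` is a hypothetical stand-in for the GC-content and depth effects.

```r
# Toy illustration in base R; no EDASeq or DESeq2 objects are involved.
# ASSUMPTION: offset = log(normalised counts) - log(raw counts).
set.seed(1)

# Simulated raw counts (4 genes x 3 samples); +1 avoids log(0)
raw <- matrix(rpois(12, lambda = 50) + 1, nrow = 4,
              dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Hypothetical gene- and sample-specific scaling (stand-in for GC/depth effects)
scaling <- matrix(runif(12, min = 0.5, max = 2), nrow = 4)
normalised <- raw / scaling

# Offset on the log scale, under the assumption stated above
offsetMat <- log(normalised) - log(raw)

# Normalization factors on the scale of the counts
normFactors <- exp(-1 * offsetMat)

# Dividing the raw counts by the factors should give back the normalised values
isTRUE(all.equal(raw / normFactors, normalised))
```

Under that sign convention, exp(-1 * offset) gives a matrix of the same dimensions as the counts, with strictly positive entries, by which the raw counts are divided. This seems to be what normalizationFactors() expects, as far as I understand the DESeq2 vignette.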
-- output of sessionInfo():
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] DESeq2_1.4.5 RcppArmadillo_0.4.300.0 Rcpp_0.11.1
[4] EDASeq_1.10.0 aroma.light_2.0.0 matrixStats_0.8.14
[7] ShortRead_1.22.0 GenomicAlignments_1.0.1 BSgenome_1.32.0
[10] Rsamtools_1.16.0 GenomicRanges_1.16.3 GenomeInfoDb_1.0.2
[13] Biostrings_2.32.0 XVector_0.4.0 IRanges_1.22.7
[16] BiocParallel_0.6.1 Biobase_2.24.0 BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] annotate_1.42.0 AnnotationDbi_1.26.0 BatchJobs_1.2
[4] BBmisc_1.6 bitops_1.0-6 brew_1.0-6
[7] codetools_0.2-8 DBI_0.2-7 DESeq_1.16.0
[10] digest_0.6.4 fail_1.2 foreach_1.4.2
[13] genefilter_1.46.1 geneplotter_1.42.0 grid_3.1.0
[16] hwriter_1.3 iterators_1.0.7 lattice_0.20-29
[19] latticeExtra_0.6-26 locfit_1.5-9.1 plyr_1.8.1
[22] RColorBrewer_1.0-5 R.methodsS3_1.6.1 R.oo_1.18.0
[25] RSQLite_0.11.4 sendmailR_1.1-2 splines_3.1.0
[28] stats4_3.1.0 stringr_0.6.2 survival_2.37-7
[31] tools_3.1.0 XML_3.98-1.1 xtable_1.7-3
[34] zlibbioc_1.10.0
--
Sent via the guest posting facility at bioconductor.org.