[BioC] DESeq2 : Using Normalised ReadCount matrix from EDAseq in DESeq2

Aditi [guest] guest at bioconductor.org
Thu Jun 26 15:16:24 CEST 2014


I want to use a read count matrix normalised with EDASeq in a downstream DESeq2 analysis, but it is not clear to me from the vignette how to do so.

These are the steps I followed:

## EDASeq - normalising the count matrix by GC content

> dataWithin <- withinLaneNormalization(data, "pct_gc", which = "full")
> dataNorm <- betweenLaneNormalization(dataWithin, which = "full")

## I normalised the counts themselves instead of generating the offsets, as mentioned in the EDASeq vignette

### DESeq2

> ?? 
> dds <- estimateDispersions(dds)
> dds <- nbinomWaldTest(dds)
> res <- results(dds)

I don't know how to create a normalization factor matrix. The DESeq2 vignette says that normalization factors should be on the scale of the counts, like size factors, and unlike offsets, which are typically on the scale of the predictors (i.e. the logarithmic scale for the negative binomial GLM).

So in that case, should I generate the offset values from EDASeq instead, i.e.

> dataWithin <- withinLaneNormalization(data, "pct_gc", which = "full", offset = TRUE)
> dataNorm <- betweenLaneNormalization(dataWithin, which = "full", offset = TRUE)
> EDASeqNormFactors <- exp(-1 * offst(dataNorm))
> normalizationFactors(dds) <- EDASeqNormFactors
> dds <- estimateDispersions(dds)
> dds <- nbinomWaldTest(dds)
> res <- results(dds)
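
To make the scale distinction concrete, here is a small toy sketch in base R only. It does not call EDASeq or DESeq2; the `offsets` matrix is made up and merely stands in for what `offst()` would return, and all object names are hypothetical. It shows log-scale offsets being moved to the scale of the counts with `exp(-1 * ...)`, as in the steps above. The row-wise centring is an optional extra step that, as far as I understand, the DESeq2 documentation suggests so that each gene's factors have a geometric mean of 1.

```r
# Toy illustration in base R only; no EDASeq or DESeq2 calls are made here.
# All object names (raw, offsets, norm_factors) are hypothetical.

# Raw counts: 3 genes (rows) x 2 samples (columns).
raw <- matrix(c(100, 200,
                 50,  80,
                 10,  40),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

# Made-up offsets on the log scale, one per gene and sample
# (assumption: this stands in for the matrix returned by offst()).
offsets <- matrix(c( 0.2, -0.2,
                    -0.1,  0.1,
                     0.0,  0.3),
                  nrow = 3, byrow = TRUE,
                  dimnames = dimnames(raw))

# Move from the log scale to the scale of the counts.
norm_factors <- exp(-1 * offsets)

# Optional: centre each gene's factors so that their geometric mean is 1.
norm_factors <- norm_factors / exp(rowMeans(log(norm_factors)))

# Dividing the raw counts by the factors gives counts on a common scale.
normalised <- raw / norm_factors
print(round(norm_factors, 3))
```

The resulting matrix has the same dimensions as the count matrix and is strictly positive, which is the shape one would expect to assign with `normalizationFactors(dds) <-`.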

 -- output of sessionInfo(): 

R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] DESeq2_1.4.5            RcppArmadillo_0.4.300.0 Rcpp_0.11.1            
 [4] EDASeq_1.10.0           aroma.light_2.0.0       matrixStats_0.8.14     
 [7] ShortRead_1.22.0        GenomicAlignments_1.0.1 BSgenome_1.32.0        
[10] Rsamtools_1.16.0        GenomicRanges_1.16.3    GenomeInfoDb_1.0.2     
[13] Biostrings_2.32.0       XVector_0.4.0           IRanges_1.22.7         
[16] BiocParallel_0.6.1      Biobase_2.24.0          BiocGenerics_0.10.0    

loaded via a namespace (and not attached):
 [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2       
 [4] BBmisc_1.6           bitops_1.0-6         brew_1.0-6          
 [7] codetools_0.2-8      DBI_0.2-7            DESeq_1.16.0        
[10] digest_0.6.4         fail_1.2             foreach_1.4.2       
[13] genefilter_1.46.1    geneplotter_1.42.0   grid_3.1.0          
[16] hwriter_1.3          iterators_1.0.7      lattice_0.20-29     
[19] latticeExtra_0.6-26  locfit_1.5-9.1       plyr_1.8.1          
[22] RColorBrewer_1.0-5   R.methodsS3_1.6.1    R.oo_1.18.0         
[25] RSQLite_0.11.4       sendmailR_1.1-2      splines_3.1.0       
[28] stats4_3.1.0         stringr_0.6.2        survival_2.37-7     
[31] tools_3.1.0          XML_3.98-1.1         xtable_1.7-3        
[34] zlibbioc_1.10.0     

Sent via the guest posting facility at bioconductor.org.
