[BioC] multiple level factors contrast in nbinomLRT, DESeq2.

sh. chunxuan hibergo at outlook.com
Wed Aug 20 12:43:41 CEST 2014

Dear list, 
I am not sure the correct way to interpret the nbinomLRT results in multiple level factors condition.

Here is a toy example. I would to find DE genes after controlling batch effect in the experiments, in which there are multiple levels. 

## make data 
dds <- makeExampleDESeqDataSet(m = 18)
colData <- data.frame(row.names = rownames(colData(dds)), sample = colData(dds)$sample, condition = rep(LETTERS[1:6], each = 3), batch = factor(rep(c(1, 1, 2), 6)))
dds <- DESeqDataSetFromMatrix(counts(dds), colData = colData, design = ~ batch + condition)
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
## LRTest
dds <- nbinomLRT(dds, reduced = formula(~ batch ))
[1] "Intercept"        "batch_2_vs_1"     "condition_B_vs_A" "condition_C_vs_A" "condition_D_vs_A" "condition_E_vs_A" "condition_F_vs_A"
res.1 <- results(dds, name = "condition_B_vs_A")
res.2 <- results(dds, contrast = list("condition_B_vs_A", "batch_2_vs_1")
1.) I would like to get the DE genes between B and A, while controlling for the batch effect. Should I take "res.1" or "res.2", or both are wrong? and what is the correct way to do it?

2.) Why "res.3 <- results(dds, contrast = c("condition", "A", "b"))" gave error: "Error in normalizeSingleBracketSubscript(j, x) :   subscript contains invalid names"

3.) In order to get DE genes between conditions not directly listed in the "resultsNames", is the following codes correct? should "batch_2_vs_1" be included int the contrast?##for example, DE between C and F, controlling for batch effect;res.4 <- results(dds, contrast = list("condition_C_vs_A", "condition_F_vs_A")

> sessionInfo()R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel  graphics  grDevices utils     datasets  stats     grid      methods   base     
other attached packages:
 [1] DESeq2_1.4.5            RcppArmadillo_0.4.320.0 Rcpp_0.11.2             GenomicRanges_1.16.3    GenomeInfoDb_1.0.2      IRanges_1.22.10         BiocGenerics_0.10.0    
 [8] magrittr_1.0.1          dplyr_0.2               RColorBrewer_1.0-5      reshape2_1.4            ggplot2_1.0.0          
loaded via a namespace (and not attached):
 [1] annotate_1.42.1      AnnotationDbi_1.26.0 assertthat_0.1       Biobase_2.24.0       colorspace_1.2-4     DBI_0.2-7            digest_0.6.4         genefilter_1.46.1   
 [9] geneplotter_1.42.0   gtable_0.1.2         lattice_0.20-29      locfit_1.5-9.1       MASS_7.3-33          munsell_0.4.2        plyr_1.8.1           proto_0.3-10        
[17] R.methodsS3_1.6.1    R.oo_1.18.0          RSQLite_0.11.4       scales_0.2.4         splines_3.1.0        stats4_3.1.0         stringr_0.6.2        survival_2.37-7     
[25] tools_3.1.0          XML_3.98-1.1         xtable_1.7-3         XVector_0.4.0       

Best, Chunxuan
PS, sorry for the previous one, I forgot the session, and have no idea why the format is a mess.

	[[alternative HTML version deleted]]

More information about the Bioconductor mailing list