[BioC] multiple level factors contrast in nbinomLRT, DESeq2.
sh. chunxuan
hibergo at outlook.com
Wed Aug 20 12:43:41 CEST 2014
Dear list,
I am not sure the correct way to interpret the nbinomLRT results in multiple level factors condition.
Here is a toy example. I would to find DE genes after controlling batch effect in the experiments, in which there are multiple levels.
## make data
library(DESeq2)
dds <- makeExampleDESeqDataSet(m = 18)
colData <- data.frame(row.names = rownames(colData(dds)), sample = colData(dds)$sample, condition = rep(LETTERS[1:6], each = 3), batch = factor(rep(c(1, 1, 2), 6)))
dds <- DESeqDataSetFromMatrix(counts(dds), colData = colData, design = ~ batch + condition)
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
## LRTest
dds <- nbinomLRT(dds, reduced = formula(~ batch ))
resultsNames(dds)
[1] "Intercept" "batch_2_vs_1" "condition_B_vs_A" "condition_C_vs_A" "condition_D_vs_A" "condition_E_vs_A" "condition_F_vs_A"
res.1 <- results(dds, name = "condition_B_vs_A")
res.2 <- results(dds, contrast = list("condition_B_vs_A", "batch_2_vs_1")
Questions:
1.) I would like to get the DE genes between B and A, while controlling for the batch effect. Should I take "res.1" or "res.2", or both are wrong? and what is the correct way to do it?
2.) Why "res.3 <- results(dds, contrast = c("condition", "A", "b"))" gave error: "Error in normalizeSingleBracketSubscript(j, x) : subscript contains invalid names"
3.) In order to get DE genes between conditions not directly listed in the "resultsNames", is the following codes correct? should "batch_2_vs_1" be included int the contrast?##for example, DE between C and F, controlling for batch effect;res.4 <- results(dds, contrast = list("condition_C_vs_A", "condition_F_vs_A")
> sessionInfo()R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel graphics grDevices utils datasets stats grid methods base
other attached packages:
[1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.10 BiocGenerics_0.10.0
[8] magrittr_1.0.1 dplyr_0.2 RColorBrewer_1.0-5 reshape2_1.4 ggplot2_1.0.0
loaded via a namespace (and not attached):
[1] annotate_1.42.1 AnnotationDbi_1.26.0 assertthat_0.1 Biobase_2.24.0 colorspace_1.2-4 DBI_0.2-7 digest_0.6.4 genefilter_1.46.1
[9] geneplotter_1.42.0 gtable_0.1.2 lattice_0.20-29 locfit_1.5-9.1 MASS_7.3-33 munsell_0.4.2 plyr_1.8.1 proto_0.3-10
[17] R.methodsS3_1.6.1 R.oo_1.18.0 RSQLite_0.11.4 scales_0.2.4 splines_3.1.0 stats4_3.1.0 stringr_0.6.2 survival_2.37-7
[25] tools_3.1.0 XML_3.98-1.1 xtable_1.7-3 XVector_0.4.0
Best, Chunxuan
PS, sorry for the previous one, I forgot the session, and have no idea why the format is a mess.
[[alternative HTML version deleted]]
More information about the Bioconductor
mailing list