[BioC] DESeq2: likelihood ratio test
Nhu Quynh T. Tran
nhuquynh.t.tran at gmail.com
Tue Aug 26 17:51:01 CEST 2014
Thank you, Mike.
I have another question. I remember you mentioned that LRT results is similar to the union of contrast Wald tests. I have 80 genes significant by LRT, and 145 genes significant when combining all 3 comparisons. Are these numbers reasonable?
dpsc.dds <- DESeqDataSetFromMatrix(countData = dpsc.counts.data,
colData = dpsc.mapping.data,
design = ~ Gender+Disease)
dpsc.dds.LRT <- DESeq(dpsc.dds,betaPrior=FALSE, test="LRT", full=~Gender+Disease, reduced=~Gender)
dpsc.res.LRT <- results(dpsc.dds.LRT)
dpsc.res <- results(dpsc.dds)
dpsc.res$padj.15qvsCon <- results(dpsc.dds, contrast=c("Disease","15q.Duplication", "Control"))$padj
dpsc.res$padj.15qvsAS <- results(dpsc.dds, contrast=c("Disease","15q.Duplication", "AS.Deletion"))$padj
My session info()
## R version 3.1.0 (2014-04-10)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
##  en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## attached base packages:
##  parallel stats graphics grDevices utils datasets methods
##  base
## other attached packages:
##  DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2
##  GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.10
##  BiocGenerics_0.10.0
## loaded via a namespace (and not attached):
##  annotate_1.42.1 AnnotationDbi_1.26.0 Biobase_2.24.0
##  DBI_0.2-7 digest_0.6.4 evaluate_0.5.5
##  formatR_0.10 genefilter_1.46.1 geneplotter_1.42.0
##  grid_3.1.0 htmltools_0.2.4 knitr_1.6
##  lattice_0.20-29 locfit_1.5-9.1 RColorBrewer_1.0-5
##  rmarkdown_0.2.49 RSQLite_0.11.4 splines_3.1.0
##  stats4_3.1.0 stringr_0.6.2 survival_2.37-7
##  tools_3.1.0 XML_3.98-1.1 xtable_1.7-3
##  XVector_0.4.0
On Aug 25, 2014, at 12:07 PM, Michael Love wrote:
> hi Quynh,
> It's helpful to include the code and output of sessionInfo() for more
> detailed answers.
> more answers inline:
> On Mon, Aug 25, 2014 at 8:24 AM, Quynh Tran <qtran1 at memphis.edu> wrote:
>> I just want to make sure that I understand the LRT correctly. My purpose is to test if Gender is a confounder in gene expressions across diseases (control, disease1, disease2). I have this code:
>> neu.dds <- DESeqDataSetFromMatrix(countData = neuron.counts.data,
>> colData = neuron.mapping.data,
>> design = ~ Gender+Disease)
>> neu.dds.LRT <- DESeq(neu.dds,betaPrior=FALSE, test="LRT", full=~Gender+Disease, reduced=~Gender)
>> The LRT test the full vs reduced model. So, the null model is reduce and the alternative model is full. When a gene has a p-value < 0.05, we conclude the gene expression changes at some disease in the presence of gender, while others only affected by Gender only???
> Yes, except the null includes genes affected by gender or not at all.
> So we typically say "testing for the effect of disease, controlling for gender"
>> Also, since my disease has 3 levels, I noticed that the p-values for disease2 vs control are the same as for disease1 vs control for the LRT, but not the same for the disease2 vs disease1. Specifically, I have 80 genes with padj< 0.05 for D1 vs Con and D2 vs Con, but have 139 genes with padj <0.05 for D2 vs D1. Why is this the case?
> In the help page for ?results, we have:
> "For analyses using the likelihood ratio test (using nbinomLRT), the
> p-values are determined solely by the difference in deviance between
> the full and reduced model formula. A log2 fold change is included,
> which can be controlled using the name argument, or by default this
> will be the estimated coefficient for the last element of
> It should also specify this information when you show the results
> object if you are using version >= 1.4, e.g.:
> log2 fold change: condition 2 vs 1
> LRT p-value: '~ condition' vs '~ 1'
> If you want to test comparisons of pairs of levels of disease, you
> should be using the Wald test (the standard DESeq() workflow), instead
> of the LRT. The LRT is for testing all levels of disease at once.
> When you use the LRT, using the 'name' argument only changes the
> columns of the results which have to do with the LFC's. For the
> current release of DESeq2 (v1.4), when you use the 'contrast'
> argument, the p-values are replaced with Wald test p-values. In the
> development branch, I am working on having more specific control of
> whether or not to replace the LRT p-values with Wald p-values. For
> now, you can build whatever results table you like by combining the
> LFC columns for all contrasts with the LRT statistic, p-values and
> adjusted p-value columns.
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> Bioconductor mailing list
> Bioconductor at r-project.org
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
[[alternative HTML version deleted]]
More information about the Bioconductor