[BioC] DESeq: Hypothesis testing in multifactor design

Yanzhu Lin mlinyzh at gmail.com
Wed Aug 20 15:59:11 CEST 2014


Hi Mike,

I am using DESeq2 package for my project now, and I have some questions
regarding to this pacakge.

Please let me briefly introduce some background information about my
project. I have three factors: A with 16 levels, B with 2 levels and C with
3 levels, in total 16*2*3=96 groups. There are 8 biosamples for each group,
hence 96*8=768 biosamples in total. Due to some issues, we lost some
replicates for some groups, which ends up 726 biosamples, hence it is
unbalance design.

Our purpose are to test the main effects of three factor: A, B and C, the
two-way interaction: A:B, A:C and B:C, and the three-way interaction term:
A:B:C. For example, I will compare full model: ~A+B+C with reduce model:
~A+B to test factor C, and so on for other two main effects. For testing
three-way interaction A:B:C, I will compare full model:
~A+B+C+A:B+A:C+B:C+A:B:C and reduced model ~A+B+C+A:B+A:C+B:C. Then come my
questions.


I have different full models for testing main effects, two-way interaction
and three-way interaction term. Will the dispersion estimation affected by
my full model? Can I specify the full model when I use nbinomLRT()? In
other words, can I use estimateDispersion() only once and fit nbinomLRT
with different full models as below:


dds<-DESeqDataSetFromMatrix(countData=countdata,colData=coldata,design=~A+B+C+A:B+A:C+B:C+A:B:C)

### normalization
dds=estimateSizeFactors(dds)

### dispersion estimation:
dds=estimateDispersions(dds)

###Test three-way interaction term.
dds<-nbinomLRT(dds,reduced=~A+B+C+A:B+A:C+B:C)
###Test main effect of factor A:
dds<-nbinomLRT(dds,full=~A+B+C, reduced=~B+C)
 ###Test main effect of factor B:
dds<-nbinomLRT(dds,full=~A+B+C, reduced=~A+C)

###Test main effect of factor C:
dds<-nbinomLRT(dds,full=~A+B+C, reduced=~A+B)


Thanks,


Yanzhu





On Tue, Jun 10, 2014 at 4:07 PM, Michael Love <michaelisaiahlove at gmail.com>
wrote:

> hi Yanzhu,
>
> Note that we recommend users switch to using DESeq2, which also has
> the likelihood ratio test you are using, and is faster and more
> sensitive.
>
> The pipeline would look like:
>
> DESeq(dds, test="LRT", reduced=~ A+B+C+A:B+A:C+B:C)
>
> for your first example.
>
> For your question, the terms of the reduced model should be contained
> within the full model. Still there are a number of models which
> satisfy this requirement, e.g. for testing B:C, you could use
> A+B+C+A:B+A:C+B:C and A+B+C+A:B+A:C as full and reduced respectively.
> Or you could use A+B+C+B:C and A+B+C. The importance of these other
> interaction terms depends on context, whether they are very
> explanatory or not.
>
> Mike
>
> On Tue, Jun 10, 2014 at 11:21 AM, yanzhu [guest] <guest at bioconductor.org>
> wrote:
> > Dear Community,
> >
> > I have a question about the hypothesis testing of the two-way
> interaction terms in a multifactor design which includes three factors: A,
> B and C.
> >
> > When I tested the three-way interaction I used the full and reduced
> models as below for nbinomGLMTest():
> > Full: count ~ A+B+C+A:B+A:C+B:C+A:B:C
> > Reduced: count ~ A+B+C+A:B+A:C+B:C
> >
> > Now comes my question, when I want to test the effect of two-way
> interaction terms, i.e., A:B, A:C or B:C, what should be my full and
> reduced models? For example, when I want to the test the effect of A:B,
> what should be my full and reduced models for nbinomGLMTest() using DESeq
> pacakge?
> >
> >
> > Best,
> >
> >
> >
> > Yanzhu
> >
> >
> >  -- output of sessionInfo():
> >
> > sessionInfo()
> > R version 3.1.0 (2014-04-10)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> >
> > locale:
> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> States.1252    LC_MONETARY=English_United States.1252
> > [4] LC_NUMERIC=C                           LC_TIME=English_United
> States.1252
> >
> > attached base packages:
> > [1] parallel  stats     graphics  grDevices utils     datasets  methods
>  base
> >
> > other attached packages:
> > [1] DESeq_1.16.0        lattice_0.20-29     locfit_1.5-9.1
> Biobase_2.24.0      BiocGenerics_0.10.0 edgeR_3.6.1         limma_3.20.1
> >
> > loaded via a namespace (and not attached):
> >  [1] annotate_1.42.0      AnnotationDbi_1.26.0 DBI_0.2-7
> genefilter_1.46.0    geneplotter_1.42.0   GenomeInfoDb_1.0.2
> >  [7] grid_3.1.0           IRanges_1.22.6       MASS_7.3-31
> RColorBrewer_1.0-5   RSQLite_0.11.4       splines_3.1.0
> > [13] stats4_3.1.0         survival_2.37-7      tools_3.1.0
> XML_3.98-1.1         xtable_1.7-3
> >
> >
> > --
> > Sent via the guest posting facility at bioconductor.org.
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>

	[[alternative HTML version deleted]]



More information about the Bioconductor mailing list