[BioC] DESeq: Hypothesis testing in multifactor design

Michael Love michaelisaiahlove at gmail.com
Wed Aug 20 18:44:32 CEST 2014


Hi Yanzhu,

On Aug 20, 2014 3:59 PM, "Yanzhu Lin" <mlinyzh at gmail.com> wrote:
>
> Hi Mike,
>
> I am using DESeq2 package for my project now, and I have some questions
regarding to this pacakge.
>
> Please let me briefly introduce some background information about my
project. I have three factors: A with 16 levels, B with 2 levels and C with
3 levels, in total 16*2*3=96 groups. There are 8 biosamples for each group,
hence 96*8=768 biosamples in total. Due to some issues, we lost some
replicates for some groups, which ends up 726 biosamples, hence it is
unbalance design.
>
> Our purpose are to test the main effects of three factor: A, B and C, the
two-way interaction: A:B, A:C and B:C, and the three-way interaction term:
A:B:C. For example, I will compare full model: ~A+B+C with reduce model:
~A+B to test factor C, and so on for other two main effects. For testing
three-way interaction A:B:C, I will compare full model:
~A+B+C+A:B+A:C+B:C+A:B:C and reduced model ~A+B+C+A:B+A:C+B:C. Then come my
questions.
>
>
> I have different full models for testing main effects, two-way
interaction and three-way interaction term. Will the dispersion estimation
affected by my full model? Can I specify the full model when I use
nbinomLRT()?

No, you should not change the design in between, i.e. don't use a different
design for dispersion estimation and the full model in the LRT.

Mike

> In other words, can I use estimateDispersion() only once and fit
nbinomLRT with different full models as below:
>
>
>
dds<-DESeqDataSetFromMatrix(countData=countdata,colData=coldata,design=~A+B+C+A:B+A:C+B:C+A:B:C)
>
> ### normalization
> dds=estimateSizeFactors(dds)
>
> ### dispersion estimation:
> dds=estimateDispersions(dds)
>
> ###Test three-way interaction term.
> dds<-nbinomLRT(dds,reduced=~A+B+C+A:B+A:C+B:C)
> ###Test main effect of factor A:
> dds<-nbinomLRT(dds,full=~A+B+C, reduced=~B+C)
> ###Test main effect of factor B:
> dds<-nbinomLRT(dds,full=~A+B+C, reduced=~A+C)
>
> ###Test main effect of factor C:
> dds<-nbinomLRT(dds,full=~A+B+C, reduced=~A+B)
>
>
> Thanks,
>
>
> Yanzhu
>
>
>
>
>
> On Tue, Jun 10, 2014 at 4:07 PM, Michael Love <michaelisaiahlove at gmail.com>
wrote:
>>
>> hi Yanzhu,
>>
>> Note that we recommend users switch to using DESeq2, which also has
>> the likelihood ratio test you are using, and is faster and more
>> sensitive.
>>
>> The pipeline would look like:
>>
>> DESeq(dds, test="LRT", reduced=~ A+B+C+A:B+A:C+B:C)
>>
>> for your first example.
>>
>> For your question, the terms of the reduced model should be contained
>> within the full model. Still there are a number of models which
>> satisfy this requirement, e.g. for testing B:C, you could use
>> A+B+C+A:B+A:C+B:C and A+B+C+A:B+A:C as full and reduced respectively.
>> Or you could use A+B+C+B:C and A+B+C. The importance of these other
>> interaction terms depends on context, whether they are very
>> explanatory or not.
>>
>> Mike
>>
>> On Tue, Jun 10, 2014 at 11:21 AM, yanzhu [guest] <guest at bioconductor.org>
wrote:
>> > Dear Community,
>> >
>> > I have a question about the hypothesis testing of the two-way
interaction terms in a multifactor design which includes three factors: A,
B and C.
>> >
>> > When I tested the three-way interaction I used the full and reduced
models as below for nbinomGLMTest():
>> > Full: count ~ A+B+C+A:B+A:C+B:C+A:B:C
>> > Reduced: count ~ A+B+C+A:B+A:C+B:C
>> >
>> > Now comes my question, when I want to test the effect of two-way
interaction terms, i.e., A:B, A:C or B:C, what should be my full and
reduced models? For example, when I want to the test the effect of A:B,
what should be my full and reduced models for nbinomGLMTest() using DESeq
pacakge?
>> >
>> >
>> > Best,
>> >
>> >
>> >
>> > Yanzhu
>> >
>> >
>> >  -- output of sessionInfo():
>> >
>> > sessionInfo()
>> > R version 3.1.0 (2014-04-10)
>> > Platform: x86_64-w64-mingw32/x64 (64-bit)
>> >
>> > locale:
>> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
States.1252    LC_MONETARY=English_United States.1252
>> > [4] LC_NUMERIC=C                           LC_TIME=English_United
States.1252
>> >
>> > attached base packages:
>> > [1] parallel  stats     graphics  grDevices utils     datasets
methods   base
>> >
>> > other attached packages:
>> > [1] DESeq_1.16.0        lattice_0.20-29     locfit_1.5-9.1
Biobase_2.24.0      BiocGenerics_0.10.0 edgeR_3.6.1         limma_3.20.1
>> >
>> > loaded via a namespace (and not attached):
>> >  [1] annotate_1.42.0      AnnotationDbi_1.26.0 DBI_0.2-7
genefilter_1.46.0    geneplotter_1.42.0   GenomeInfoDb_1.0.2
>> >  [7] grid_3.1.0           IRanges_1.22.6       MASS_7.3-31
RColorBrewer_1.0-5   RSQLite_0.11.4       splines_3.1.0
>> > [13] stats4_3.1.0         survival_2.37-7      tools_3.1.0
XML_3.98-1.1         xtable_1.7-3
>> >
>> >
>> > --
>> > Sent via the guest posting facility at bioconductor.org.
>> >
>> > _______________________________________________
>> > Bioconductor mailing list
>> > Bioconductor at r-project.org
>> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>

	[[alternative HTML version deleted]]



More information about the Bioconductor mailing list