[BioC] EdgeR multi-factor testing question

Gordon K Smyth smyth at wehi.EDU.AU
Thu Jan 9 23:34:51 CET 2014


Dear Yanzhu,

Yes, that's how I would do it.  Keep the same dispersions for all fits.
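
As a minimal sketch of this (not code from the thread: the simulated counts, the factor names A, B and C, and all object names are made up for illustration), the dispersions can be estimated once under the design of model 1 and then reused when fitting models 2 and 3:

```r
library(edgeR)

# Hypothetical data: 200 genes, 2x2x2 factorial design, 2 replicates per cell
set.seed(1)
targets <- expand.grid(A = factor(c("a1", "a2")),
                       B = factor(c("b1", "b2")),
                       C = factor(c("c1", "c2")),
                       rep = 1:2)
counts <- matrix(rnbinom(200 * nrow(targets), mu = 50, size = 10), nrow = 200)
rownames(counts) <- paste0("gene", 1:200)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Model 1: full factorial. Dispersions are estimated once, under this design.
design1 <- model.matrix(~ A * B * C, data = targets)
y <- estimateGLMCommonDisp(y, design1)
y <- estimateGLMTagwiseDisp(y, design1)

# Test the three-way interaction (the last column of design1)
fit1 <- glmFit(y, design1)
lrt3 <- glmLRT(fit1, coef = ncol(design1))

# Model 2: drop the three-way term, keep the dispersions stored in y
design2 <- model.matrix(~ (A + B + C)^2, data = targets)
fit2 <- glmFit(y, design2)
lrt2 <- glmLRT(fit2, coef = 5:7)   # the three two-way interaction columns

# Model 3: additive model, again with the same dispersions
design3 <- model.matrix(~ A + B + C, data = targets)
fit3 <- glmFit(y, design3)
lrtA <- glmLRT(fit3, coef = 2)     # main effect of A

# Two-way tests are only looked at for genes whose three-way interaction
# is non-significant (the 0.05 FDR cutoff is an arbitrary illustration)
ok2 <- p.adjust(lrt3$table$PValue, method = "BH") > 0.05
head(lrt2$table[ok2, ])
```

Because glmFit() takes the dispersions stored in y, and y is never re-estimated after the model 1 step, all three fits share the same values.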

Best wishes
Gordon

> Date: Wed,  8 Jan 2014 06:36:16 -0800 (PST)
> From: "Yanzhu [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, mlinyzh at gmail.com
> Subject: [BioC] EdgeR multi-factor testing question
>
> Dear Gordon,
>
> I have one more question about the estimation of dispersion.
>
> When the three-way interaction term is non-significant, I will fit
> model 2, without the three-way interaction, to test the two-way
> interaction terms. When all interaction terms are non-significant, I will
> fit the additive model (model 3) to test the main effects. Could I use the
> same dispersions for all the models, i.e., model 1 (including
> everything), model 2 (without the three-way interaction term) and model 3
> (additive model)? Could these dispersions be estimated under the design of
> model 1?
>
> Thank you!
> Yanzhu
>
> ---------------------------------------------------------
>
> Dear Yanzhu,
>
> Your analysis is fine from a code point of view.  From a statistical point
> of view, however, your analysis is too simple because you are neglecting the
> principle of marginality:
>
>   http://en.wikipedia.org/wiki/Principle_of_marginality
>
> For the model you have fitted, it makes sense to test for the three-way
> interaction as you do.  However, it does not make statistical sense to test
> for the main effects or two-way interactions until you have established that
> the three-way interaction is non-significant.
>
> For count data, the tests for the lower-level interactions need to be
> computed by successively removing each level of interactions from the
> model.  See for example:
>
>   https://stat.ethz.ch/pipermail/bioconductor/2013-December/056584.html
>
> This is the same as what the anova() function does in R for unbalanced
> linear factorial models.
>
> Furthermore, testing the two-way interactions is only sensible for genes
> with non-significant 3-way interactions.  Similarly, testing the main effect
> is only sensible for genes with non-significant 2-way and 3-way
> interactions.  Otherwise these tests have no useful scientific meaning.
>
> This is a basic drawback of the factorial anova approach.  You might
> consider the alternative approach described in Section 3.3.1 of the edgeR
> User's Guide.
>
> Best wishes
> Gordon
>
>
>
>
> -- output of sessionInfo():
>
>>  sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] edgeR_3.2.4  limma_3.16.8
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.1
>
>
