[BioC] edgeR: interactions in linear models

Gordon K Smyth smyth at wehi.EDU.AU
Mon May 13 03:08:24 CEST 2013


On Sun, 12 May 2013, Ryan C. Thompson wrote:

> Hi Gordon,
>
> In a previous email on the list you said:
>
>> testing for the 2-way interaction in the presence of a 3-way 
>> interaction does not make statistical sense.  This is because the 
>> parametrization of the 2-way interaction as a subset of the 3-way is 
>> somewhat arbitrary. Before you can test the 2-way interaction 
>> species*treatment in a meaningful way you would need to accept that the 
>> 3-way interaction is not necessary and remove it from the model.
>
> Does this mean that it is impossible to test for a 2-way interaction 
> when your model includes a 3-way interaction term?

It is mathematically possible but has no scientific meaning.  This is 
called the marginality principle in linear models:

  http://en.wikipedia.org/wiki/Principle_of_marginality

> Or does it just mean that the parametrization provided by 
> "model.matrix(~1+factor1*factor2*factor3)" is such that the 2-way 
> interaction is not represented by any coefficient, but rather by a 
> complex contrast?

The same principle applies regardless of the parametrization.
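
As a rough illustration of why the 2-way term is arbitrary inside the 3-way 
model, here is a minimal R sketch on made-up, noise-free data. The factor 
levels (A/B, C/D, E/F) and the response y are assumptions for illustration 
only and do not come from this thread. Under R's default treatment contrasts, 
the coefficient labelled factor1B:factor2D is the factor1-by-factor2 
interaction at the reference level of factor3 only, so releveling factor3 
changes its value even though the fitted model is unchanged.

```r
# Hypothetical 2x2x2 design, 2 replicates per cell (all names illustrative).
d <- expand.grid(rep = 1:2,
                 factor1 = c("A", "B"),
                 factor2 = c("C", "D"),
                 factor3 = c("E", "F"))

# Noise-free response: only the B/D/F cell differs from the others, so the
# factor1 x factor2 interaction is 0 within E and 4 within F.
d$y <- ifelse(d$factor1 == "B" & d$factor2 == "D" & d$factor3 == "F", 4, 0)

# Full 3-way model with factor3 reference level "E".
fit_E <- lm(y ~ factor1 * factor2 * factor3, data = d)

# Same model, but with factor3 releveled so that "F" is the reference.
d2 <- d
d2$factor3 <- relevel(d2$factor3, ref = "F")
fit_F <- lm(y ~ factor1 * factor2 * factor3, data = d2)

# The "2-way interaction" coefficient depends on the reference level chosen.
int_ref_E <- coef(fit_E)[["factor1B:factor2D"]]
int_ref_F <- coef(fit_F)[["factor1B:factor2D"]]
print(c(ref_E = int_ref_E, ref_F = int_ref_F))

# The two fits are nevertheless the same model: identical fitted values.
same_fit <- isTRUE(all.equal(unname(fitted(fit_E)), unname(fitted(fit_F))))
print(same_fit)
```

In this sketch the two coefficients answer different questions (interaction 
within E versus interaction within F), which is one way to see that neither 
is "the" 2-way interaction while the 3-way term remains in the model.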

>> I prefer to fit the saturated model (a different level for each treatment 
>> combination) and make specific contrasts. There is some discussion of this 
>> in the limma User's Guide.

> If I understand correctly here, you are saying that one can fit a model 
> where each coefficient represents the abundance for one specific 
> combination of the 3 factors, as in the Limma User's Guide section 8.5.2 
> "Analysing as for a Single Factor". In other words, one could do 
> "model.matrix(~0+factor1:factor2:factor3)" and this would be an 
> alternate parametrization of the same design. And with this 
> parametrization, all the 2- and 3-way interaction terms (and simple 
> pairwise comparisons) can easily be tested from the single full 3-way 
> interaction by specifying the appropriate contrasts. Do I understand 
> correctly?

Yes, I am recommending the group mean parametrization, as in the limma 
User's Guide Section 8.5.2 or edgeR User's Guide Section 3.3.1.

I recommend this parametrization because each contrast that is drawn has 
an explicit meaning in terms of comparisons of groups and can be 
interpreted on its own terms.
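
A minimal base-R sketch of the group mean parametrization follows, assuming 
a hypothetical balanced 2x2x2 design. The factor levels, the helper 
make_contrast() and the cell means mu are illustrative assumptions, not 
taken from the User's Guides. Each contrast is written out as an explicit 
comparison of group means.

```r
# Hypothetical 2x2x2 design, 2 replicates per cell (all names illustrative).
d <- expand.grid(rep = 1:2,
                 factor1 = c("A", "B"),
                 factor2 = c("C", "D"),
                 factor3 = c("E", "F"))

# One group per treatment combination; one design column per group.
group <- factor(paste(d$factor1, d$factor2, d$factor3, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Hypothetical helper: build a full-length contrast vector from named weights.
make_contrast <- function(weights) {
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[names(weights)] <- weights
  v
}

# Simple pairwise comparison: B vs A, within C and E.
B_vs_A <- make_contrast(c(B.C.E = 1, A.C.E = -1))

# factor1 x factor2 interaction within each level of factor3:
# (B.D - A.D) - (B.C - A.C)
int_E <- make_contrast(c(B.D.E = 1, A.D.E = -1, B.C.E = -1, A.C.E = 1))
int_F <- make_contrast(c(B.D.F = 1, A.D.F = -1, B.C.F = -1, A.C.F = 1))

# 3-way interaction: does the factor1 x factor2 interaction differ between E and F?
three_way <- int_F - int_E

# Check against made-up cell means in which only B.D.F is shifted.
mu <- setNames(numeric(ncol(design)), colnames(design))
mu["B.D.F"] <- 4
print(c(int_E = sum(int_E * mu),
        int_F = sum(int_F * mu),
        three_way = sum(three_way * mu)))

# The group mean design spans the same space as the factorial formula:
# stacking both design matrices does not raise the rank above 8.
factorial_design <- model.matrix(~ factor1 * factor2 * factor3, data = d)
joint_rank <- qr(cbind(design, factorial_design))$rank
print(joint_rank)
```

Vectors like these could then be supplied as contrasts to the fitting 
software, for example through the contrast argument of glmLRT in edgeR or 
through contrasts.fit in limma; that step is omitted here so the sketch 
runs with base R alone.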

The original poster did what I intended.

Best
Gordon

> -Ryan Thompson
>
