[BioC] edgeR: interactions in linear models

Mon May 13 03:29:20 CEST 2013

Dear Ryan,

The marginality principle is most easily understood in a 2-way factorial 
model.  Suppose you have a 2x2 factorial experiment with two genotypes and 
two treatments (active vs control).

If a two-way interaction exists, then this means that the treatment effect 
is different for the genotypes.  It makes no sense to test for a 
"treatment effect" in this situation (even though mathematical models 
allow you to do so) because there is no consistent treatment effect 
without specifying the genotype.

On the other hand, it is always meaningful to test for a treatment effect in 
the two genotypes separately, and then to ask whether the two treatment 
effects are consistent or different.
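
As a minimal sketch of that idea in base R: the 2x2 design can be written with one coefficient per genotype/treatment group, and the per-genotype treatment effects and their difference can then be formed as explicit contrasts. The genotype labels (WT, KO), the replicate count, and the group means below are hypothetical values chosen only for illustration; in a real limma or edgeR analysis the contrasts would be passed to the package's own fitting functions rather than to lm.fit.

```r
# Hypothetical 2x2 layout: 2 genotypes x 2 treatments, 3 replicates per group.
genotype  <- factor(rep(c("WT", "KO"), each = 6), levels = c("WT", "KO"))
treatment <- factor(rep(rep(c("control", "active"), each = 3), times = 2),
                    levels = c("control", "active"))

# Group-mean parametrization: one coefficient per genotype/treatment combination.
group <- factor(paste(genotype, treatment, sep = "."),
                levels = c("WT.control", "WT.active", "KO.control", "KO.active"))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Contrasts written by hand as weights on the four group means.
contrasts <- cbind(
  TreatInWT   = c(-1,  1,  0, 0),   # active - control within WT
  TreatInKO   = c( 0,  0, -1, 1),   # active - control within KO
  Interaction = c( 1, -1, -1, 1)    # TreatInKO - TreatInWT
)
rownames(contrasts) <- colnames(design)

# Noise-free toy response built from made-up group means, so the fitted
# coefficients simply recover those means.
mu <- c(WT.control = 5, WT.active = 7, KO.control = 5, KO.active = 6)
y  <- drop(design %*% mu)
fit <- lm.fit(design, y)

# Each estimate is a direct comparison of group means.
estimates <- drop(t(contrasts) %*% fit$coefficients)
print(estimates)
```

With these made-up means the treatment effect is larger in WT than in KO, so the Interaction contrast is nonzero and a single "treatment effect" averaged over genotypes would not describe either genotype.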

In a 3-way factorial model, a 3-way interaction means that the experiment 
cannot be reduced to 2-way marginals in any meaningful way.
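
The same point can be sketched for three factors, again with hand-written contrasts on group means. The factor names species and treatment follow the earlier thread; the third factor (time) and all level labels are hypothetical. The species-by-treatment interaction is defined separately at each level of the third factor, and the 3-way interaction is the difference between those two.

```r
# Hypothetical 2x2x2 layout: one row per species/treatment/time group.
cells <- expand.grid(species   = c("s1", "s2"),
                     treatment = c("ctl", "trt"),
                     time      = c("t1", "t2"))
cells$group <- with(cells, paste(species, treatment, time, sep = "."))

# Helper (illustrative only): +1 / -1 weights on named group means.
weights_for <- function(plus, minus) {
  v <- setNames(numeric(nrow(cells)), cells$group)
  v[plus]  <- 1
  v[minus] <- -1
  v
}

# species:treatment interaction within each time point:
# (s2.trt - s2.ctl) - (s1.trt - s1.ctl)
int_t1 <- weights_for(plus  = c("s2.trt.t1", "s1.ctl.t1"),
                      minus = c("s2.ctl.t1", "s1.trt.t1"))
int_t2 <- weights_for(plus  = c("s2.trt.t2", "s1.ctl.t2"),
                      minus = c("s2.ctl.t2", "s1.trt.t2"))

# 3-way interaction: does the species:treatment interaction change with time?
three_way <- int_t2 - int_t1
print(cbind(int_t1, int_t2, three_way))
```

If three_way is nonzero, int_t1 and int_t2 are different quantities, and each has to be interpreted at its own time point rather than as one overall species-by-treatment interaction.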

Best wishes
Gordon

On Mon, 13 May 2013, Gordon K Smyth wrote:

> On Sun, 12 May 2013, Ryan C. Thompson wrote:
>
>> Hi Gordon,
>> 
>> In a previous email on the list you said:
>> 
>>> testing for the 2-way interaction in the presence of a 3-way interaction 
>>> does not make statistical sense.  This is because the parametrization of 
>>> the 2-way interaction as a subset of the 3-way is somewhat arbitrary. 
>>> Before you can test the 2-way interaction species*treatment in a 
>>> meaningful way you would need to accept that the 3-way interaction is not 
>>> necessary and remove it from the model.
>> 
>> Does this mean that it is impossible to test for a 2-way interaction when 
>> your model includes a 3-way interaction term?
>
> It is mathematically possible but has no scientific meaning.  This is called 
> the marginality principle in linear models:
>
> http://en.wikipedia.org/wiki/Principle_of_marginality
>
>> Or does it just mean that the parametrization provided by 
>> "model.matrix(~1+factor1*factor2*factor3)" is such that the 2-way 
>> interaction is not represented by any coefficient, but rather by a complex 
>> contrast?
>
> The same principle applies regardless of the parametrization.
>
>>> I prefer to fit the saturated model (a different level for each treatment 
>>> combination) and make specific contrasts. There is some discussion of this 
>>> in the limma User's Guide.
>
>> If I understand correctly here, you are saying that one can fit a model 
>> where each coefficient represents the abundance for one specific 
>> combination of the 3 factors, as in the limma User's Guide section 8.5.2 
>> "Analysing as for a Single Factor". In other words, one could do 
>> "model.matrix(~0+factor1:factor2:factor3)" and this would be an alternate 
>> parametrization of the same design. And with this parametrization, all the 
>> 2- and 3-way interaction terms (and simple pairwise comparisons) can easily 
>> be tested from the single full 3-way interaction by specifying the 
>> appropriate contrasts. Do I understand correctly?
>
> Yes, I am recommending the group mean parametrization, as in the limma User's 
> Guide Section 8.5.2 or edgeR User's Guide Section 3.3.1.
>
> I recommend this parametrization because each contrast that is drawn has an 
> explicit meaning in terms of comparisons of groups and can be interpreted on 
> its own terms.
>
> The original poster did what I intended.
>
> Best
> Gordon
>
>> -Ryan Thompson
>> 
>
