[BioC] F-tests for factorial effects - limma

Gordon Smyth smyth at wehi.edu.au
Fri Jan 7 02:12:52 CET 2005


Why don't I find main effects of interest? Let's suppose you're considering 
genotype*treatment and the genotypes are wt and ko (knockout). My 
experience is that an experimenter wants to know which genes respond to the 
treatment in the wt, which genes respond in the ko, and which respond 
differently between the two genotypes (the interaction). So the experiment 
wants two separate lists and want to compare and constrast these lists. But 
the factorial main effect for the treatment refers to the *average* 
response for a each gene averaged over wt and ko. So it is not a real 
effect for any given genotype. I've never seen a microarray context where 
the the experimenter wanted to know responses averaged over different and 
essentially different genotypes. Of course one may want an effect averaged 
over different cell lines or individuals from the same genotype, but not 
over different genotypes which are essentially different. That's my take on 
it. I would be interested in different opinions. I've written briefly on 
this in the Factorial Design section on page 30 of the User's Guide at 
http://bioinf.wehi.edu.au/limma/usersguide.pdf

The reason that main effects are computationally different to interactions 
is that I have to treat all microarray experiments as unbalanced designs. I 
have to do this because I have to allow for missing values or spot quality 
weights in the case of two-colour data and for weights arising from robust 
estimation or affyPLM standard errors in the case of affy data. This means 
that the anova break-down is a sequential ANOVA table with first main 
effect adjusted for intercept, second main effect adjusted for the first, 
and interaction adjusted for all main effects. In the case of the 
interaction one can just fit the full model and then test that the 
coefficients are zero. With main effects one can't do this because that 
would be equivalent to adjusting the main effects for the interaction which 
is not what is wanted. This means that to get the main effect SS you 
actually have to fit three or four different models to get the sequential 
SS. Of course you could othogonalize all the coefficients to solve the 
problem, but this would have to be done on a gene-wise basis to allow for 
the possibility of gene-specific weights.

Gordon

At 01:14 AM 23/12/2004, Naomi Altman wrote:
>Thanks, Gordon.  We will try contrasts.fit.  If that does not give us what 
>we want, I will consider writing an aov.series routine.
>
>The microarray experiments I see are often factorial, RCB, or split plot 
>designs.  Particularly with Affy arrays, they are often balanced.  I guess 
>I am used to thinking in terms of "effects" rather than individual 
>contrasts, but why would the main effect and interaction idea be different 
>for microarray experiments than for other experiments?
>
>e.g. treatment 1-3
>       genotype 1-3
>
>The interaction is always of interest.  However, even in the absence of 
>interaction, the genotype or treatment effect is often of interest.  e.g. 
>the genotype may be various knockouts - we are interested in knowing what 
>other genes have been induced or repressed.  Why is the test for main 
>effects for a particular gene less interesting in this context than e.g. 
>the change in weight or other univariate measure?
>
>
>--Naomi
>
>At 12:23 AM 12/23/2004 +1100, Gordon K Smyth wrote:
>> > Date: Tue, 21 Dec 2004 17:21:19 -0500
>> > From: Naomi Altman <naomi at stat.psu.edu>
>> > Subject: [BioC] F-tests for factorial effects - limma
>> > To: bioconductor at stat.math.ethz.ch
>> >
>> > I am analyzing a 2-factor factorial Affy experiment, with 3 d.f. for each
>> > factor.
>> >
>> > I would like to get the F-tests for the main effects and interactions 
>> using
>> > limma.
>> >
>> > I have computed all the contrasts, and got the t-tests (both 
>> unadjusted and
>> > eBayes).  I do know how to combine these into F-tests "by hand" but I 
>> could
>> > not figure out if there was a simple way to do this using limma.
>>
>>limma doesn't have any easy way to deal with main effects and 
>>interactions, at least not with main
>>effects, interactions are actually simpler.  I haven't implemented this 
>>because I've never been
>>able to figure out what one would do with these things in a microarray 
>>context.
>>
>>To compute F-tests for main effects and interaction, the easiest way 
>>would probably be to compute
>>the SS for main effects and interactions by non-limma means, then use 
>>shrinkVar() to adjust the
>>residual mean squares, i.e., the F-statistic denominators.
>>
>>If you only want F-tests for interactions, the following code would work:
>>
>>X <- model.matrix(~a*b)
>>fit <- lmFit(eset, X)
>>p <- ncol(X)
>>cont.ia <- diag(p)[,attr(X,"assign")==3]
>>fit.ia <- eBayes(contrasts.fit(fit, cont.ia))
>>
>>Now fit.ia contains the F-statistic and p-values for the interaction in 
>>fit.ia$F and
>>fit.ia$F.p.value.
>>
>> > I had a look at FStat (classifyTestsF).  There seems to be a problem, in
>> > that the matrix tstat is not premultiplied by the contrast matrix when the
>> > F-statistics are computed.  So, if the contrasts are not full-rank, an
>> > error is generated (instead of the F-statistics) because nrow(Q) !=
>> > ncol(tstat)..  (See the lines below).
>>
>>No, the code is correct.  FStat is quite happy with non full rank 
>>contrasts but the contrast
>>matrix must be applied using contrasts.fit() before entering 
>>FStat().  You should not expect to
>>see a contrast matrix inside the classifyTestsF() code.
>>
>>Gordon
>>
>> > if (fstat.only) {
>> >          fstat <- drop((tstat%*% Q)^2 %*% array(1, c(r, 1)))
>> >          attr(fstat, "df1") <- r
>> >          attr(fstat, "df2") <- df
>> >          return(fstat)
>> >      }
>> >
>> > I figured that before I fiddled with the code, I would check to make sure
>> > that I didn't miss an existing routine to do this.
>> >
>> > Thanks in advance.
>> >
>> > Naomi S. Altman                                814-865-3791 (voice)
>> > Associate Professor
>> > Bioinformatics Consulting Center
>> > Dept. of Statistics                              814-863-7114 (fax)
>> > Penn State University                         814-865-1348 (Statistics)
>> > University Park, PA 16802-2111



More information about the Bioconductor mailing list