[BioC] question about lmFit model

Jenny Drnevich drnevich at illinois.edu
Thu Jan 21 18:05:51 CET 2010

Hi Sabrina,

First, a little list etiquette. If you don't get a response to a 
post within a day, it's not considered polite to just repost the same 
question verbatim the next day under a different Subject.

Second: your question isn't specific to the modeling of lmFit. 
Instead, it's a general statistical question about why it's better to 
fit one ANOVA model instead of a series of t-tests. I suggest you consult 
a basic statistical textbook or a local statistician to find the answer.
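
For reference, here is a rough sketch of what the single-model setup 
could look like for the design described in the quoted message. It is 
only an illustration under stated assumptions: the array ordering, the 
contrast names, and the simulated expression matrix are all made up, 
and only the fitGene / fitGene2 / cont.matrix names come from the 
original post. The limma part runs only if limma is installed.

```r
# Hypothetical layout: 3 strains x 2 treatments x 4 replicate arrays = 24 arrays.
strain    <- factor(rep(c("A", "B", "C"), each = 8))
treatment <- factor(rep(rep(c("T1", "T2"), each = 4), times = 3))
group     <- factor(paste(strain, treatment, sep = "_"))

# One column per strain/treatment combination (cell-means coding).
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)   # A_T1, A_T2, B_T1, B_T2, C_T1, C_T2

# The seven comparisons, written as weights on the six group means.
# Contrast names are illustrative, not taken from the original post.
cont.matrix <- cbind(
  A_T1vsT2 = c( 1, -1,  0,  0,  0,  0),
  T1_BvsA  = c(-1,  0,  1,  0,  0,  0),
  T1_CvsA  = c(-1,  0,  0,  0,  1,  0),
  B_T1vsT2 = c( 0,  0,  1, -1,  0,  0),
  C_T1vsT2 = c( 0,  0,  0,  0,  1, -1),
  Int_B    = c(-1,  1,  1, -1,  0,  0),  # (B_T1 - B_T2) - (A_T1 - A_T2)
  Int_C    = c(-1,  1,  0,  0,  1, -1)   # (C_T1 - C_T2) - (A_T1 - A_T2)
)
rownames(cont.matrix) <- colnames(design)

# Residual degrees of freedom per gene: all 24 arrays in one model
# versus a separate model fitted to strains A and B only.
df_full   <- nrow(design) - ncol(design)
df_subset <- sum(strain != "C") - 4

if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  exprs <- matrix(rnorm(100 * 24), nrow = 100)  # simulated data, 100 "genes"
  fitGene  <- limma::lmFit(exprs, design)
  fitGene2 <- limma::contrasts.fit(fitGene, cont.matrix)
  fitGene2 <- limma::eBayes(fitGene2)
  print(limma::topTable(fitGene2, coef = "Int_B", number = 5))
}
```

With this layout the single model leaves 18 residual degrees of freedom 
per gene, against 12 for a model fitted to A and B alone, because the 
residual variance is estimated from all 24 arrays. Whether pooling the 
variance across strains is reasonable for a given data set is the 
statistical question to settle first.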


At 10:39 AM 1/21/2010, sabrina s wrote:
>Hello, everyone:
>I have a question related to conceptual understanding of lmFit.
>I have the following experiment that I want to conduct, but I am not sure
>which is the right way to use design matrix and contrasts. Here is the
>say I have 3 different strains that are genetically different, A, B and C
>where A is the control. I also have two different treatments,
>  T1 and T2. For each strain, I have 4 arrays for each treatment, so in
>total, I have 24 arrays. What I want to find out is the significantly
>differentially expressed genes for the following comparison:
>1) for control strain A:  T1 vs T2
>2) under T1, B vs. A (control)
>3) under T1, C vs. A
>4) for B, T1 vs T2
>5) for C, T1 vs T2
>6) interaction term of A and B , T1 and T2
>7) interaction term of A and C, T1 and T2.
>There are two ways I could use lmFit
>One is:
>for the design matrix, I will include all 3 strains and 2 conditions,
>I use the following code:
>             A_T1, A_T2, B_T1, B_T2, C_T1, C_T2
>sample1:   1      ,0         ,0,        0,      0  ,         0
>sample2 :
>Then make a contrast matrix and follow the code below:
>  fitGene2<-contrasts.fit(fitGene,cont.matrix)
>Instead of using all samples at one time to fit into a lmFit function, I use
>two design matrix only involves A and B, T1 and T2,
>and second design matrix that involves A and C, T1 and T2, and make contrast
>matrix and fit separately. and later on I can compare these two
>results if I want to.
>The question I have is: which one is the right one? For the first method, I
>will have large DOF , and much lower p-values, but it was testing the
>same thing as the second one, so am I creating an artifact? Thanks for
>your help!

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu
