[BioC] help with limma design,contrast matrix please

James W. MacDonald jmacdon at med.umich.edu
Thu Jan 26 18:17:34 CET 2012


Hi John,

On 1/25/2012 8:21 AM, John Alexander wrote:
> Hi all,
> I'd really appreciate if someone can explain to me whether I've taken the following steps correctly. I'm trying to find the difference in gene expression between (wt vs mutant) for illumina mouse data(MouseWG-6 v2.0 chip). I used a design and contrast matrix (although i'm not sure whether i really need the contrast matrix).
>
> Script :
>
> exprs(BSD.quantile)->matrixExprs
> design<-model.matrix(~-1+factor(c(1,1,1,1,1,2,2,2,2,2,2)))
> colnames(design)<-c("group1","group2")
> fit<-lmFit(matrixExprs,design)
> contrast.matrix<-makeContrasts(group2-group1,levels=design)
> fit2<-contrasts.fit(fit,contrast.matrix)
> fit2<- eBayes(fit2)
> topTable(fit2,coef=1,adjust="BH")
>
>
>> pData(BSL)
> sampleID
> 23 wt 23 wt
> 24 wt 24 wt
> 48 wt 48 wt
> 61 wt 61 wt
> 71 wt 71 wt
> 8-Het 8-Het
> 28-Het 28-Het
> 54-Het 54-Het
> 59-Het 59-Het
> 79-Het 79-Het
> 87-Het 87-Het
>
> I created my design matrix as below (group1 -wt,group2-het):
> group1 group2
> 1 1 0
> 2 1 0
> 3 1 0
> 4 1 0
> 5 1 0
> 6 0 1
> 7 0 1
> 8 0 1
> 9 0 1
> 10 0 1
> 11 0 1
> attr(,"assign")
> [1] 1 1
> attr(,"contrasts")
> attr(,"contrasts")$`factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2))`
> [1] "contr.treatment"
>
> Contrasts
> Levels group2 - group1
> group1 -1
> group2 1
>
>
> I'm hoping this is correct, i'd appreciate if someone could give me a bit more understanding. I'm quite confused reading the limma guide.

You did everything correctly. As you note, you could have done things 
without a contrasts matrix if you had parameterized your model 
differently. As an example,

 > model.matrix(~factor(rep(1:2, each=5)))
    (Intercept) factor(rep(1:2, each = 5))2
1            1                           0
2            1                           0
3            1                           0
4            1                           0
5            1                           0
6            1                           1
7            1                           1
8            1                           1
9            1                           1
10           1                           1

You can figure out what the coefficients in your model are, by noting 
that each  row of the design matrix corresponds to the model for that 
sample.

For instance, in your original design matrix, you had

1    0

in the first row, which corresponds to your WT samples. So this implies 
that the model you are fitting is

WT sample = 1(something) + 0(other thing)

And for row 6-10, it was

0    1

indicating

Het sample = 0(something) + 1(other thing)

Knowing that we are computing means, we can then deduce that 'something' 
== means of WT samples, and 'other thing' == means of Het samples. Thus, 
to make comparisons you have to use a contrast matrix to do the 
subtraction explicitly.

In my example, the first 5 rows are obvious. We are computing the mean 
of the WT samples. However, the rows 6-10 are mysterious. Why two 1s?

We know that the samples for these rows are Het, and the first 
coefficient is WT, so the model is

Het sample = 1(WT) + 1(wazzat)

But simple algebra tells us that this is equivalent to

Het sample - WT = wazzat

Right?

So the second coefficient in my model is explicitly computing the 
difference between Het and WT, which is what you want. And you can get 
it without a contrasts matrix. Just do

fit <- lmFit(matrixExprs, design)
fit2 <- eBayes(fit)

and then the coefficient of interest is coef=2. Try it and see.

Best,

Jim


>
> cheers,
> john
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 



More information about the Bioconductor mailing list