[BioC] limma design matrix

Walter Glaser walter.glaser at univie.ac.at
Thu Jan 5 11:59:37 CET 2012


Hi Assa,

For myself, I find formulating the design matrix like this to be very 
accessible:

# define the factors with names (extended from James version)
condition <- factor(rep(c("wt","LiCl"), each = 3, times = 3))
group <- factor(rep(c("total","low","high"), each = 6))
# concatenate factor combinations together
combinations <- paste(condition, group, sep=".")
# create factors from combinations
f <- factor(combinations, levels=unique(combinations))
design <- model.matrix(~0+f)
# beautify column names
colnames(design) <- levels(f)
design
    wt.total LiCl.total wt.low LiCl.low wt.high LiCl.high
1         1          0      0        0       0         0
2         1          0      0        0       0         0
3         1          0      0        0       0         0
4         0          1      0        0       0         0
5         0          1      0        0       0         0
6         0          1      0        0       0         0
7         0          0      1        0       0         0
8         0          0      1        0       0         0
9         0          0      1        0       0         0
10        0          0      0        1       0         0
11        0          0      0        1       0         0
12        0          0      0        1       0         0
13        0          0      0        0       1         0
14        0          0      0        0       1         0
15        0          0      0        0       1         0
16        0          0      0        0       0         1
17        0          0      0        0       0         1
18        0          0      0        0       0         1

This gives you a matrix where every factor combination has a separate 
column.
After you create the fit you can easily create your contrasts of 
interest using
the column names, e.g. for the comparison of the total:

cont.matrix <- makeContrasts(WTvsLiClTotal=wt.total-LiCl.total, 
levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)

For your case I would suggest reading the excellent Limma userguide:
"Chapter 8.7 Factorial Designs" which explains the the different design 
matrices and upon which my example is based on.

Hope that helps.

Cheers,
Walter

On 03.01.2012 16:47, Assa Yeroslaviz wrote:
> Hi James,
>
> Thanks for the answer.
> I like the first idea of having the design matrix more flexible. The
> problem I have with understanding it, regards the different columns.
> How can I compare two conditions using this matrix.
> If I understand it correctly, I have the wt in the first column and the and
> the LiCl in the second one. The third column contains the 'low' and the
> fourth the 'high' files.
>
> First question - what about the 'total' files - how can I get to them with
> this matrix?
>
> Second question - how can I for example can compare the wt.total with the
> LiCl.total files (or the other two groups)?
>
> Thanks
>
> Assa
>
> On Tue, Jan 3, 2012 at 16:08, MacDonald, James<jmacdon at med.umich.edu>wrote:
>
>> Hi Assa,
>>
>>
>> On 1/3/12 7:32 AM, Assa Yeroslaviz wrote:
>>
>>> Hi bioC User,
>>>
>>> I have a set of 18 Affymetrix arrays with three different conditions. The
>>> annotation looks like that:
>>> number    condition    group
>>> 19    wt    total
>>> 20    wt    total
>>> 21    wt    total
>>> 22    LiCl    total
>>> 23    LiCl    total
>>> 24    LiCl    total
>>> 25    wt    low
>>> 26    wt    low
>>> 27    wt    low
>>> 28    LiCl    low
>>> 29    LiCl    low
>>> 30    LiCl    low
>>> 31    wt    high
>>> 32    wt    high
>>> 33    wt    high
>>> 34    LiCl    high
>>> 35    LiCl    high
>>> 36    LiCl    high
>>>
>>> for each of the groups total, low and high I would like to compare wt vs.
>>> LiCl (treatment).
>>> I was trying to create a design matrix for the complete data set, but
>>> somehow I am mixing everything up.
>>>
>>> Can someone please help me with the with the correct construction of the
>>> design matrix.
>>>
>>>
>> The easiest way is to use the model.matrix() function.
>>
>> What I would normally do is this:
>>
>> condition<- factor(rep(1:2, each = 3, times = 3))
>> group<- factor(rep(1:3, each = 6))
>>
>> design<- model.matrix(~ 0 + condition + group)
>>
>> Because that will allow more comparisons than what you want. However, it
>> is easy enough to get what you want:
>>
>>
>> design<- model.matrix(~0 + factor(rep(1:6, each = 3)))
>>
>> colnames(design)<- paste(rep(c("wt","LiCl"), 3),
>> rep(c("total","low","high"), each = 2), sep = ".")
>>
>> You will note that the design matrix you get is not the same as what you
>> have below, because you cannot have a column of zeros.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>   If I understand it correctly it should looks like this:
>>> 1    0    0    0    0    0    0
>>> 1    0    0    0    0    0    0
>>> 1    0    0    0    0    0    0
>>> 0    1    0    0    0    0    0
>>> 0    1    0    0    0    0    0
>>> 0    1    0    0    0    0    0
>>> 0    0    1    0    0    0    0
>>> 0    0    1    0    0    0    0
>>> 0    0    1    0    0    0    0
>>> 0    0    0    1    0    0    0
>>> 0    0    0    1    0    0    0
>>> 0    0    0    1    0    0    0
>>> 0    0    0    0    1    0    0
>>> 0    0    0    0    1    0    0
>>> 0    0    0    0    1    0    0
>>> 0    0    0    0    0    1    0
>>> 0    0    0    0    0    1    0
>>> 0    0    0    0    0    1    0
>>>
>>> But somehow I can neither create it nor write the right header.
>>> I would appreciate your help
>>>
>>> Thanks
>>> Assa
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________**_________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https://stat.ethz.ch/mailman/listinfo/bioconductor>
>>> Search the archives: http://news.gmane.org/gmane.**
>>> science.biology.informatics.**conductor<http://news.gmane.org/gmane.science.biology.informatics.conductor>
>>>
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>
>> ************************************************************
>> Electronic Mail is not secure, may not be read every day, and should not
>> be used for urgent or sensitive issues
>>
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>

-- 
-------------------------------------------------------------
Dr. Walter Glaser

Max F. Perutz Laboratories
Department of Medical Biochemistry
Medical University of Vienna
Dr. Bohrg. 9/2
1030 Vienna
AUSTRIA



More information about the Bioconductor mailing list