[BioC] Design matrix for simple time course

James W. MacDonald jmacdon at med.umich.edu
Fri Mar 3 19:18:07 CET 2006


Hi Mick,

michael watson (IAH-C) wrote:
> Hi 
> 
> I am trying to create a design matrix for a simple, one-channel
> time-course experiment.
> 
> I have five time points with three replicated arrays at each time point.
> I want to set up the design matrix.
> 
> I tried using:
> 
> 	model.matrix(~factor(rep(1:5,each=3)))
> 
> Vaguely following the tutorial here
> (http://bioinf.wehi.edu.au/marray/jsm2005/lab5/lab5.html)
> 
> However, I only have one factor to model, time.
> 
> The matrix that comes out has a first column of all ones, the
> intercept.  What I (think) I want is the first column to have three 1's
> and the rest 0's.
> 
> I guess I'm really struggling as I don't know what the difference is
> between the output of model.matrix, with an Intercept column of all 1's,
> and the design matrix I want, which has a first column of three 1's at
> the top and the rest 0's.

This is a problem. If you are trying to analyze your data using a 
sophisticated tool like limma but you don't understand the models you 
are fitting, I would venture to say that you are putting the cart before 
the horse. I would strongly recommend either finding a local 
statistician who is willing to sit down with you and explain the 
difference between a cell means and a factor effects ANOVA model, or at 
the very least perusing a textbook that covers these topics.

I would recommend something like 'Applied linear statistical models' by 
Neter, Kutner, Nachtsheim and Wasserman, which gives many clear examples 
and is highly approachable.

As a start, here is the basic difference between the two models. In a 
factor effects model (the one with an intercept, given by all 1's in the 
first column), the intercept term estimates the mean expression at one 
time point, the reference level (in this case, the 1st timepoint), and 
the other four terms estimate the *difference* between the given 
timepoint and the first (e.g., time2 - time1, time3 - time1, etc). In 
this scenario you might not need a contrast matrix if these are the 
comparisons you are interested in. If you want other comparisons then 
you have to do the algebra to figure out the correct contrast matrix.
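
To make the factor effects interpretation concrete, here is a minimal 
sketch in base R. The expression values are simulated for a single 
hypothetical gene (they are not from any real experiment), and the 
object names are my own. It only checks that the fitted coefficients 
match the group means in the way described above.

```r
# Five time points, three replicate arrays each (as in the question)
time <- factor(rep(1:5, each = 3))

# Factor effects parameterization: an intercept plus four difference columns
design_fe <- model.matrix(~ time)

# Simulated expression for one gene (hypothetical values, illustration only)
set.seed(1)
y <- rnorm(15, mean = rep(c(5, 6, 7, 8, 9), each = 3))

# Ordinary least squares fit and the observed mean at each time point
fit_fe <- lm.fit(design_fe, y)
group_means <- sapply(split(y, time), mean)

# Intercept = mean at time 1; "time2" = mean(time 2) - mean(time 1)
all.equal(unname(fit_fe$coefficients["(Intercept)"]),
          unname(group_means["1"]))
all.equal(unname(fit_fe$coefficients["time2"]),
          unname(group_means["2"] - group_means["1"]))
```

Both comparisons should come back TRUE, which is just the statement 
above in numbers: the intercept is the first time point, and each 
remaining coefficient is a difference from it.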

In a cell means model, you are estimating the mean expression at each 
timepoint, so you have to set up explicit contrasts to do whatever 
comparisons you are interested in. As Ben Bolstad already noted, you fit 
this model by adding a -1 (or a 0) to the formula in your call to 
model.matrix(). This gives the design you describe, with a first column 
of three 1's followed by 0's.
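
As a rough sketch of the cell means version, again in base R with 
simulated values for one hypothetical gene: the contrast names 
(t2_vs_t1, t5_vs_t4) and the two comparisons chosen are assumptions for 
illustration only, not a recommendation for your experiment.

```r
# Five time points, three replicate arrays each
time <- factor(rep(1:5, each = 3))

# Cell means parameterization: no intercept, one indicator column per time
design_cm <- model.matrix(~ 0 + time)
design_cm[, "time1"]   # three 1's followed by twelve 0's

# Contrast matrix written out by hand: rows are coefficients, columns are
# the comparisons of interest (hypothetical choices)
contrasts_cm <- cbind(
  t2_vs_t1 = c(-1, 1, 0, 0, 0),
  t5_vs_t4 = c(0, 0, 0, -1, 1)
)
rownames(contrasts_cm) <- colnames(design_cm)

# Simulated expression for one gene (hypothetical values, illustration only)
set.seed(1)
y <- rnorm(15, mean = rep(c(5, 6, 7, 8, 9), each = 3))

# Each coefficient is the mean at one time point; contrasts are differences
fit_cm <- lm.fit(design_cm, y)
group_means <- sapply(split(y, time), mean)
estimates <- drop(t(contrasts_cm) %*% fit_cm$coefficients)
```

Here estimates["t2_vs_t1"] is the same quantity as the time2 coefficient 
in the factor effects fit, so the two parameterizations describe the 
same model and differ only in how the comparisons are read off. In limma 
itself you would normally hand a design like this to lmFit() and build 
the contrasts with makeContrasts() and contrasts.fit() rather than by 
hand.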

HTH,

Jim


> 
> :-s
> 
> Thanks
> Mick
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


