[BioC] help needed - 7 array time course gene expression no replicates - fold change calculation

James W. MacDonald jmacdon at uw.edu
Tue Jul 10 16:08:28 CEST 2012


Hi Liat,

On 7/10/2012 7:09 AM, Liat [guest] wrote:
> Dear All,
>
> I am new to both bioconductor and microarrays and am struggling quite a bit.
>
> I am trying to analyze data collected at 7 time points for just 1 treatment (so basically at time 0 something was added to the cells and we want to know how expression changed along time). There are no replicates.
>
> Not having replicates seems to cause quite a lot of problems. I assume I'm just not using the right packages/calls.

It depends on what assumptions you are willing to make. If you want to 
assume that there is a linear response between time and gene expression 
(where time is considered to be a continuous covariate rather than a 
factor level), then you can fit a model using these data. You could even 
allow for curvature in the line by adding a quadratic or cubic term. In 
that situation you could still use limma.

However, if you are simply looking for differences between, say, time
1 and time 0, then you have no replication and will have to rely on
fold change alone. This is a simple matter of subtracting one column of
your data matrix from another (assuming that you have taken logs, which
you should do). With log2 values, a two-fold change corresponds to a
difference of 1 in either direction.
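
As a rough sketch of that column subtraction, assuming a hypothetical
log2-scale matrix `expr` with one row per gene and one column per time
point (the simulated values and the gene and column names are
illustrative only, not from the original data):

```r
# Hypothetical log2 expression matrix: 5 genes x 7 time points (0-6 hours).
# Values are simulated purely for illustration.
set.seed(1)
expr <- matrix(rnorm(5 * 7, mean = 8, sd = 0.1), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("t", 0:6)))
expr["gene2", 4:7] <- expr["gene2", 4:7] + 2   # spike in an increase from t3 onward

# Log2 fold change of each later time point relative to time 0:
# the t0 column is subtracted from every other column.
logFC <- expr[, -1] - expr[, "t0"]

# A two-fold change is |log2 fold change| >= 1; keep genes that reach it
# at any time point.
changed <- rownames(logFC)[apply(abs(logFC) >= 1, 1, any)]
changed
```

Without replicates there is no way to attach a p-value to these
differences, so the cutoff is a filter rather than a test.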

>
> I tried using the limma package, but my design matrix is actually a vector (as I only have one treatment) and that doesn't seem to work.

The design matrix will not be a single vector in any case. You are not
comparing treatments, as you have only one treatment. You are comparing
times, for which you have seven observations. Let's say you used 0 and
1-6 hours as time points. You could use a design matrix like

> time <- seq(0,6,1)
> model.matrix(~time)
  (Intercept) time
1           1    0
2           1    1
3           1    2
4           1    3
5           1    4
6           1    5
7           1    6
attr(,"assign")
[1] 0 1

where, obviously, the first column is the intercept and the second
column is time as a continuous covariate. You could also add a
quadratic term

> model.matrix(~time+I(time^2))
  (Intercept) time I(time^2)
1           1    0         0
2           1    1         1
3           1    2         4
4           1    3         9
5           1    4        16
6           1    5        25
7           1    6        36


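where you are allowing for curvature in one direction. You might want to
add a cubic term as well, which will allow for two bends in the curve,
but you are really running out of degrees of freedom at that point: with
seven arrays and four coefficients, only three residual degrees of
freedom remain.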
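
A minimal sketch of fitting the linear-in-time model with limma might
look like the following. It assumes the limma package is installed and
uses a simulated log2 matrix `expr` (hypothetical, with one gene given
an artificial trend) in place of real data:

```r
library(limma)

# Hypothetical log2 expression matrix: 100 genes x 7 arrays (simulated).
set.seed(1)
time <- seq(0, 6, 1)
expr <- matrix(rnorm(100 * 7, mean = 8, sd = 0.2), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("t", time)))
expr[1, ] <- expr[1, ] + 0.5 * time   # gene1 rises linearly with time

# Time as a continuous covariate: an intercept plus a slope.
design <- model.matrix(~ time)

fit <- lmFit(expr, design)   # one linear model per gene
fit <- eBayes(fit)           # moderated t-statistics

# Rank genes by evidence for a non-zero slope (log2 units per hour).
tab <- topTable(fit, coef = "time", number = 5)
tab
```

Here the `logFC` column of the result is the estimated slope, not a
fold change between two arrays. Swapping in
`model.matrix(~ time + I(time^2))` would add the quadratic term at the
cost of one more degree of freedom.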

There are other things you could do as well. You could look for big 
changes between time points (and relatively unchanged expression at all 
other times) by aggregating time points. As an example, a given gene 
might be relatively unchanged at the first three time points, then jump 
up to a higher expression level and remain there for the remaining four 
time points. A t-test comparing the mean of the first three time points
with the mean of the remaining four would tease that out. You could do
several such comparisons (time 0 vs all others, times 0 and 1 vs times
2-6, etc.).
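
One way to sketch the first-three-versus-last-four comparison in base R
is shown below. The matrix `expr` is again simulated and hypothetical,
and the equal-variance t-test is an assumption made here because there
are so few points per group:

```r
# Hypothetical log2 expression matrix: 4 genes x 7 time points (simulated).
set.seed(1)
expr <- matrix(rnorm(4 * 7, mean = 8, sd = 0.1), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("t", 0:6)))
expr["gene3", 4:7] <- expr["gene3", 4:7] + 1.5   # step up after the third time point

# Treat the first three time points as one group and the last four as another.
group <- factor(c(rep("early", 3), rep("late", 4)))

# Per-gene two-sample t-test, late versus early.
res <- t(apply(expr, 1, function(y) {
  tt <- t.test(y[group == "late"], y[group == "early"], var.equal = TRUE)
  c(diff = unname(tt$estimate[1] - tt$estimate[2]), p.value = tt$p.value)
}))

# Genes ordered by p-value; 'diff' is the late-minus-early mean difference.
res[order(res[, "p.value"]), ]
```

Other splits (time 0 vs all others, and so on) only require changing
`group`. Note that this treats the time points within each group as if
they were replicates, which is exactly the kind of assumption that has
to be defended.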

Again, there are underlying assumptions for this sort of analysis, and 
you are looking for a very particular pattern. It really comes down to 
what sort of assumptions you are willing to make, and whether or not you 
will be able to defend those assumptions to others.

Best,

Jim



>
> I would like to consider genes that show a minimum of two-fold change in expression. So (I think - again, I'm a complete newbie) I need to compare each of time points 1-6 to time point 0 and look at the difference in expression levels.
>
> How can I do that?
>
> Your help will be greatly appreciated!
> Liat.

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


