[BioC] Finding genes with significant linear regression or trend

Sean Davis sdavis2 at mail.nih.gov
Mon May 19 12:56:47 CEST 2008


On Mon, May 19, 2008 at 6:39 AM, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:
> What is the best way in bioconductor to find genes that have a
> significant trend with a continuous variable e.g. concentration or time.
>  This would be using microarray data and trying to find genes that show
> a dose response or a time response.  In the simplest of cases this would
> be a linear regression.  For example I have an experiment with time
> points 24,48,72,96 and I would like to find genes who have expression
> that increases with time i.e. expression is greater in each of the time
> points.
>
> I have looked into trying to do this with limma but the user manual only
> seems to deal with time courses with each time being a factor rather
> than a continuous variable.

Limma will deal with continuous variables just fine.  Put the variable
into the design matrix as a number rather than as a factor, and limma
will fit a slope for it.

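library(limma)
# 10 genes (rows) x 10 arrays (columns) of simulated expression values
genes <- matrix(rnorm(100), ncol=10)
# one value of the continuous variable per array
df <- data.frame(var1=rnorm(10))
dm <- model.matrix(~ var1, data=df)
fit1 <- lmFit(genes, dm)
fit2 <- eBayes(fit1)
# coef=1 is the intercept, coef=2 is the slope for var1
topTable(fit2, coef=2)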
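
For the time course in the question (24, 48, 72 and 96 hours), the same
idea might look like the sketch below.  The number of replicates, the
simulated expression matrix and the gene names are assumptions made up
for illustration; substitute your own normalized log-expression data.

```r
library(limma)

set.seed(1)
# Hypothetical design: 4 time points (hours), 2 replicate arrays each
time <- rep(c(24, 48, 72, 96), each = 2)

# Simulated placeholder data: 100 genes x 8 arrays of log-expression
expr <- matrix(rnorm(100 * length(time)), nrow = 100)
rownames(expr) <- paste0("gene", 1:100)
# Give one gene an artificial upward trend so there is something to find
expr[1, ] <- expr[1, ] + 0.1 * time

# time is numeric, so the design has an intercept and a single slope column
design <- model.matrix(~ time)

fit <- eBayes(lmFit(expr, design))
# logFC here is the estimated change in log-expression per hour
tt <- topTable(fit, coef = "time")
tt
```

A positive logFC for the time coefficient means expression goes up with
time on average; it does not by itself guarantee an increase at every
single time point.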

However, keep in mind the hypothesis you will be testing--that the
gene expression changes are linearly correlated with the variable.
While some genes may show this effect, there are probably plenty of
other important and interesting genes that will not fit this model.
The same reasoning holds for the dose-response relationship; if you
are lucky enough (or smart enough) to be on the linear portion of the
dose response curve for one gene, you may be very far away from linear
for another gene.

So, to summarize, be sure that linearity is the appropriate model
before applying it; in biology, it might very well not be the correct
model for all genes.
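
One rough way to check for departures from linearity, offered as a
sketch rather than a recommendation, is to add a quadratic term and look
at its coefficient.  The replicate count and the simulated data below
are assumptions for illustration only.

```r
library(limma)

set.seed(2)
# Hypothetical design: 4 time points (hours), 3 replicate arrays each
time <- rep(c(24, 48, 72, 96), each = 3)

# Simulated placeholder data: 50 genes x 12 arrays of log-expression
expr <- matrix(rnorm(50 * length(time)), nrow = 50)

# Orthogonal polynomial in time: one linear and one quadratic column
design <- model.matrix(~ poly(time, 2))
colnames(design) <- c("Intercept", "linear", "quadratic")

fit <- eBayes(lmFit(expr, design))
lin  <- topTable(fit, coef = "linear")     # evidence for a linear trend
quad <- topTable(fit, coef = "quadratic")  # evidence for curvature
quad
```

Genes with a small p-value for the quadratic coefficient are ones where
a straight line is probably not the whole story.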

Sean


