[BioC] LIMMA lmscFit: difference between linear models

Wed Sep 28 01:15:03 CEST 2005

Hi Koen,

> But the thing is, I think that the technical variation is much smaller
> than the biological variation to begin with (high quality Agilent arrays
> vs human samples)! For example, the distance between all pat1 channels in
> a hierarchical clustering is much smaller than between individuals.
> Doesn't that imply that "removing" the technical variance shouldn't have
> such a big impact on my analysis results? So is the linear model doing
> something else that I'm missing here?

Why would you want to remove it? And would you be planning on doing
that? As far as I can see, it would be more work for you and screw up
your analysis. Even if it doesn't have a big impact, it would still have
an impact.

> Second, and this is something that is more related to some limitations in
> the LIMMA package, the technical replicates are now treated as biological
> replicates. So probably the number of significant genes is overestimated.
> But to what extent? Is there a way to estimate this overestimation? Or
> should I let the qPCR validation do the talking?

I do not think that is correct. Your second model takes both the
technical replicates (in the original fit) and the biological replicates
(in the contrast) into account, so I wouldn't expect it to overestimate
the number of genes because of that.

You would probably want to look at the false discovery rate, though, to
make sure you're correcting for multiple testing across all the genes.
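
To make the false discovery rate point concrete, here is a minimal Python
sketch of the Benjamini-Hochberg adjustment, written out by hand. It is
only an illustration, not limma's own code, and the p-values are made up.
In limma itself the adjustment can be requested through the adjust.method
argument of topTable (for example "BH").

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that adjusted values
    # never increase as the raw p-values get smaller.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(pvals)

# Genes called significant at a 5% false discovery rate.
significant = [i for i, q in enumerate(adj) if q <= 0.05]
print(adj)
print(significant)
```

Note that the third and fourth genes have raw p-values below 0.05 but are
no longer called significant after the adjustment.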

Of course, you will want to validate those results anyway.

> Does anybody have any comments/suggestions on the best method to detect
> differential gene expression in an experiment with low technical variance
> and high biological variance?

We're in about the same situation as you are and we're quite happy with
Limma.

Francois