[BioC] A single linear model for all

Arne.Muller at aventis.com Arne.Muller at aventis.com
Wed Apr 28 15:21:36 CEST 2004


A while ago I posted "programming problem: running many ANOVAs" (I actually got a very sophisticated reply, too sophisticated for me :-( ...). Following that posting, I came across another problem with linear models.

I usually run a simple linear model including all my factors (dose, time, batch) for each probeset on the array, i.e. I construct and run >12,000 linear models and ANOVAs. The model could be:

    Value ~ batch + time + dose
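
A minimal sketch of this per-probeset approach in R, assuming the data are in long format with columns Value, batch, time, dose and gene. The toy data are simulated, and the third probeset ID and the balanced design are assumptions made only so the example runs.

```r
# Simulated long-format data: 3 probesets, 2 replicates per factor combination.
# Values are random noise; only the structure matters here.
set.seed(1)
dat <- expand.grid(
  rep   = 1:2,
  batch = c("NEW", "OLD", "PRG"),
  time  = c("04h", "24h"),
  dose  = c("000mM", "025mM"),
  gene  = c("100001_at", "100002_at", "100003_at"),  # third ID is made up
  stringsAsFactors = TRUE
)
dat$Value <- rnorm(nrow(dat), mean = 6, sd = 1)

# Fit one linear model per probeset and keep its ANOVA table.
per_gene <- lapply(split(dat, dat$gene), function(d) {
  anova(lm(Value ~ batch + time + dose, data = d))
})

# Collect the p-value of each term: one row per probeset, one column per factor.
pvals <- t(sapply(per_gene, function(a) a[c("batch", "time", "dose"), "Pr(>F)"]))
colnames(pvals) <- c("batch", "time", "dose")
print(pvals)
```

With >12,000 probesets the same loop applies unchanged; only the number of elements returned by split() grows.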

I was thinking about running just a single linear model that includes the probes (actually the probesets, i.e. the genes):

    Value ~ gene + batch + time + dose + gene*batch + gene*time + gene*dose

The gene (probeset) interacts with each main effect.

The actual data frame would look like this:

    Value     batch  time  dose   gene
    5.225589  NEW    24h   000mM  100001_at
    5.207835  NEW    24h   000mM  100001_at
    4.138210  NEW    24h   000mM  100001_at
    7.253535  OLD    24h   000mM  100001_at
    4.018591  PRG    04h   025mM  100001_at
    7.205778  PRG    04h   000mM  100001_at
    8.191978  NEW    24h   000mM  100002_at
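
A small sketch of what the single model would look like on a data frame of this shape, again on simulated data (the values, the third probeset ID and the balanced design are assumptions). In R formula syntax, gene * (batch + time + dose) expands to the gene main effect, the three factor main effects and the three gene-by-factor interactions. Because every term then interacts with gene, the fitted values should coincide with those from separate per-probeset fits; what changes is that the residual variance is pooled over all probesets.

```r
# Simulated long-format data with the same columns as the data frame above.
set.seed(1)
dat <- expand.grid(
  rep   = 1:2,
  batch = c("NEW", "OLD", "PRG"),
  time  = c("04h", "24h"),
  dose  = c("000mM", "025mM"),
  gene  = c("100001_at", "100002_at", "100003_at"),  # third ID is made up
  stringsAsFactors = TRUE
)
dat$Value <- rnorm(nrow(dat), mean = 6, sd = 1)

# Single model: gene main effect plus gene-by-factor interactions.
fit_all <- lm(Value ~ gene * (batch + time + dose), data = dat)

# Separate models, one per probeset, for comparison.
fit_each <- lapply(split(dat, dat$gene), function(d) {
  lm(Value ~ batch + time + dose, data = d)
})

# Put the per-probeset fitted values back into the original row order.
fitted_each <- numeric(nrow(dat))
for (g in names(fit_each)) {
  fitted_each[dat$gene == g] <- fitted(fit_each[[g]])
}

# Compare the fitted values of the two approaches.
same_fit <- all.equal(unname(fitted(fit_all)), fitted_each)
print(same_fit)

# Residual degrees of freedom: pooled model vs. sum over per-probeset models.
df_all  <- df.residual(fit_all)
df_each <- sum(sapply(fit_each, df.residual))
print(c(pooled = df_all, summed = df_each))
```

The per-gene F-tests would differ between the two approaches only through the error term: the single model tests each effect against one common residual variance, which is a strong assumption for expression data.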

I'm absolutely not sure about this. There are several problems:

1. What about the degrees of freedom? They would be huge.
2. I don't know how to interpret summary(fit).
3. It is computationally impossible (on my machine) ;-( ...
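
To put rough numbers on points 1 and 3, here is a back-of-the-envelope calculation in R. Only the 12,000 probesets come from the text above; the number of arrays and the number of coefficients per probeset (based on the factor levels visible in the example rows) are assumptions for illustration.

```r
n_genes  <- 12000  # from the post: >12,000 probesets
n_arrays <- 24     # assumption: number of arrays in the experiment
n_coef   <- 5      # assumption: intercept + 2 batch + 1 time + 1 dose contrasts

# Dimensions of the design matrix for the single model with all
# gene-by-factor interactions: one row per measurement, one block of
# n_coef columns per probeset.
rows <- n_genes * n_arrays
cols <- n_genes * n_coef

# Residual degrees of freedom of the single model.
resid_df <- rows - cols

# Memory for a dense design matrix of doubles (8 bytes each), in GiB.
gib <- rows * cols * 8 / 2^30

cat("design matrix:", rows, "x", cols, "\n")
cat("residual df:  ", resid_df, "\n")
cat("dense size:   ", round(gib), "GiB\n")
```

Under these assumptions the dense design matrix alone would need well over 100 GiB, which is consistent with lm() being infeasible on a desktop machine, and the residual degrees of freedom run into the hundreds of thousands.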

I'm more interested in whether anybody here has already tried this seriously, i.e. worked on the statistical theory + biological interpretation.

	kind regards,


Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com
