[BioC] (Limma) different toptable results for the same dataset using 2 different designs

Mon Dec 5 20:23:30 CET 2005

Dear All,

I am facing a problem that has been discussed previously between Bjorn 
Usadel and Gordon Smyth on the 9-11th Nov 2005.
It is about producing two different topTables using the same initial 
eset, but trying with 2 different design lay outs. However my problem 
seems to be exaxctly the same as the one stated by Bjorn, it isn't fixed 
if I follow Gordon's indications to overcome the "probesets with zero 
variance" problem:
I have 4 chips, 2 biological replicates of condition A and condition B 
on a custom affymetrix array.
R version 2.2.0, 2005-10-06, i386-pc-mingw32
limma "2.2.0"
Biobase "1.8.0"

Here is what I've done:
 > design2 <- cbind(A=1, BvsA=c(0,0,1,1))
 > fit2 <- lmFit(eset, design2)
 > fit2eB <- eBayes(fit2)
 > toptable(fit2eB, n=10)
            M        t      P.Value        B
853  12.40696 143.0466 5.357015e-06 8.649517
974  10.69453 133.1012 5.357015e-06 8.580885
4813 10.49636 129.6253 5.357015e-06 8.553634
4718 11.01015 128.9608 5.357015e-06 8.548209
3587 10.24772 126.1652 5.357015e-06 8.524578
3812 12.35814 126.1029 5.357015e-06 8.524036
2170 11.39107 125.5051 5.357015e-06 8.518801
4265 12.62847 123.7714 5.357015e-06 8.503256
3372 11.47227 123.5476 5.357015e-06 8.501209
345   9.98423 123.4283 5.357015e-06 8.500113

however if I do the following:
 > design <- model.matrix(~ -1+factor(c(1,1,2,2)))
 > colnames(design) <- c("A", "B")
 > fit1 <- lmFit(eset, design)
 > contrast.matrix <- makeContrasts(A-B, levels=design)
 > fit12 <- contrasts.fit(fit1, contrast.matrix)
 > fit1eB <- eBayes(fit12)
 > toptable(fit1eB, n=10)
             M          t   P.Value          B
311  -3.997113 -13.961019 0.5794648 -0.5112598
1327 -1.461801 -11.334987 0.5794648 -0.6889117
113  -1.690073 -10.880814 0.5794648 -0.7308602
4408 -3.066882 -10.019535 0.5794648 -0.8232774
1825 -3.576034  -9.781223 0.5794648 -0.8523026
1224 -1.099785  -9.445264 0.5794648 -0.8961448
289  -2.800736  -9.306995 0.5794648 -0.9152559
288  -1.689312  -8.759499 0.5794648 -0.9977157
2892 -2.513426  -8.675603 0.5794648 -1.0113879
3311  3.005392   8.377018 0.5794648 -1.0625077

Following the instructions given by Gordon, then I looked at:
 > i <- (fit2eB$sigma==0)                               # same result 
with fit1eB
 > sum(i)
[1] 0

so the probe-sets with zero variance doesn't seem to be the reason here...
I, of course, would be tempted to believe the 1st option (giving 
differentially expressed genes with B > 8) but it turns out that 96% of 
the genes are differentially expressed in this 1st option, which is 
quite unlikely!
I can not understand why is it so.
Any  suggestions and/or indication of what I may have done wrong would 
be gratefully appreciated.

All the best,
Celine

-- 
Celine Carret PhD
Pathogen Microarrays group
The Wellcome Trust Sanger Institute
Hinxton, Cambridge CB10 1SA, UK.
tel. +44 (0)1223 834 244 ext.7123
fax. +44 (0)1223 494 919
email: ckc at sanger.ac.uk
http://www.sanger.ac.uk/PostGenomics/PathogenArrays/