[BioC] (Limma) different toptable results for the same dataset using 2 different designs

Tue Dec 6 12:26:56 CET 2005

> Date: Mon, 05 Dec 2005 19:23:30 +0000
> From: Celine Carret <ckc at sanger.ac.uk>
> Subject: [BioC] (Limma) different toptable results for the same
> 	dataset using 2 different designs
> To: bioconductor at stat.math.ethz.ch
>
> Dear All,
>
> I am facing a problem that has been discussed previously between Bjorn
> Usadel and Gordon Smyth on the 9-11th Nov 2005.
> It is about producing two different topTables using the same initial
> eset, but trying with 2 different design lay outs. However my problem
> seems to be exaxctly the same as the one stated by Bjorn, it isn't fixed
> if I follow Gordon's indications to overcome the "probesets with zero
> variance" problem:
> I have 4 chips, 2 biological replicates of condition A and condition B
> on a custom affymetrix array.
> R version 2.2.0, 2005-10-06, i386-pc-mingw32
> limma "2.2.0"
> Biobase "1.8.0"
>
> Here is what I've done:
>  > design2 <- cbind(A=1, BvsA=c(0,0,1,1))
>  > fit2 <- lmFit(eset, design2)
>  > fit2eB <- eBayes(fit2)
>  > toptable(fit2eB, n=10)

topTable(fit2eB, coef=2)

Gordon

>             M        t      P.Value        B
> 853  12.40696 143.0466 5.357015e-06 8.649517
> 974  10.69453 133.1012 5.357015e-06 8.580885
> 4813 10.49636 129.6253 5.357015e-06 8.553634
> 4718 11.01015 128.9608 5.357015e-06 8.548209
> 3587 10.24772 126.1652 5.357015e-06 8.524578
> 3812 12.35814 126.1029 5.357015e-06 8.524036
> 2170 11.39107 125.5051 5.357015e-06 8.518801
> 4265 12.62847 123.7714 5.357015e-06 8.503256
> 3372 11.47227 123.5476 5.357015e-06 8.501209
> 345   9.98423 123.4283 5.357015e-06 8.500113
>
> however if I do the following:
>  > design <- model.matrix(~ -1+factor(c(1,1,2,2)))
>  > colnames(design) <- c("A", "B")
>  > fit1 <- lmFit(eset, design)
>  > contrast.matrix <- makeContrasts(A-B, levels=design)
>  > fit12 <- contrasts.fit(fit1, contrast.matrix)
>  > fit1eB <- eBayes(fit12)
>  > toptable(fit1eB, n=10)
>              M          t   P.Value          B
> 311  -3.997113 -13.961019 0.5794648 -0.5112598
> 1327 -1.461801 -11.334987 0.5794648 -0.6889117
> 113  -1.690073 -10.880814 0.5794648 -0.7308602
> 4408 -3.066882 -10.019535 0.5794648 -0.8232774
> 1825 -3.576034  -9.781223 0.5794648 -0.8523026
> 1224 -1.099785  -9.445264 0.5794648 -0.8961448
> 289  -2.800736  -9.306995 0.5794648 -0.9152559
> 288  -1.689312  -8.759499 0.5794648 -0.9977157
> 2892 -2.513426  -8.675603 0.5794648 -1.0113879
> 3311  3.005392   8.377018 0.5794648 -1.0625077
>
> Following the instructions given by Gordon, then I looked at:
>  > i <- (fit2eB$sigma==0)                               # same result
> with fit1eB
>  > sum(i)
> [1] 0
>
> so the probe-sets with zero variance doesn't seem to be the reason here...
> I, of course, would be tempted to believe the 1st option (giving
> differentially expressed genes with B > 8) but it turns out that 96% of
> the genes are differentially expressed in this 1st option, which is
> quite unlikely!
> I can not understand why is it so.
> Any  suggestions and/or indication of what I may have done wrong would
> be gratefully appreciated.
>
> All the best,
> Celine
>
>
>
>
> --
> Celine Carret PhD
> Pathogen Microarrays group
> The Wellcome Trust Sanger Institute
> Hinxton, Cambridge CB10 1SA, UK.
> tel. +44 (0)1223 834 244 ext.7123
> fax. +44 (0)1223 494 919
> email: ckc at sanger.ac.uk
> http://www.sanger.ac.uk/PostGenomics/PathogenArrays/