Fwd: [R] problems with limma

Gordon K Smyth smyth at wehi.EDU.AU
Tue Dec 21 23:18:04 CET 2004


On Wed, December 22, 2004 12:11 am, r.ghezzo at staff.mcgill.ca said:
> ----- Forwarded message from r.ghezzo at staff.mcgill.ca -----
>     Date: Mon, 20 Dec 2004 15:45:11 -0500
>     From: r.ghezzo at staff.mcgill.ca
> Reply-To: r.ghezzo at staff.mcgill.ca
>  Subject: [R] problems with limma
>       To: r-help at stat.math.ethz.ch
>
> I try to send this message To Gordon Smyth at smyth at vehi,edu.au but it bounced
> back, so here it is to r-help

Questions about limma should be sent to the Bioconductor mailing list (the address is given in the
introduction of the Limma User's Guide for example).

> I am trying to use limma, just downloaded it from CRAN. I use R 2.0.1 on Win XP
> see the following:
>> library(RODBC)
>> chan1 <- odbcConnectExcel("D:/Data/mgc/Chips/Chips4.xls")
>> dd <- sqlFetch(chan1,"Raw")   # all data  12000
>> #
>> nzw <- cbind(dd$NZW1C,dd$NZW2C,dd$NZW3C,dd$NZW1T,dd$NZW2T,dd$NZW3T)
>> akr <- cbind(dd$AKR1C,dd$AKR2C,dd$AKR3C,dd$AKR1T,dd$AKR2T,dd$AKR3T)
>> bas <- cbind(dd$NZW1C,dd$NZW2C,dd$NZW3C,dd$AKR1C,dd$AKR2C,dd$AKR3C)
>> #
>>  design<-matrix(c(1,1,1,1,1,1,0,0,0,1,1,1),ncol=2)
>>  fit1 <- lmFit(nzw,design)
>>  fit1 <- eBayes(fit1)
>>  topTable(fit1,adjust="fdr",number=5)
>               M         t      P.Value         B
> 12222  3679.480 121.24612 7.828493e-06 -4.508864
> 1903   3012.405 118.32859 7.828493e-06 -4.508866
> 9068   1850.232  92.70893 1.178902e-05 -4.508889
> 10635  2843.534  91.99336 1.178902e-05 -4.508890
> 561   18727.858  90.17085 1.178902e-05 -4.508893
>> #
>>  fit2 <- lmFit(akr,design)
>>  fit2 <- eBayes(fit2)
>>  topTable(fit2,adjust="fdr",number=5)
>               M        t      P.Value         B
> 88     1426.738 80.48058 5.839462e-05 -4.510845
> 1964  36774.167 73.05580 5.839462e-05 -4.510861
> 5854   7422.578 68.60316 5.839462e-05 -4.510874
> 11890  1975.316 66.54480 5.839462e-05 -4.510880
> 9088   2696.952 64.16343 5.839462e-05 -4.510889
>> #
>>  fit3 <- lmFit(bas,design)
>>  fit3 <- eBayes(fit3)
>>  topTable(fit3,adjust="fdr",number=5)
>              M         t      P.Value         B
> 6262  1415.088 100.78933 2.109822e-05 -4.521016
> 5660  1913.479  96.40903 2.109822e-05 -4.521020
> 11900 4458.489  94.30738 2.109822e-05 -4.521022
> 9358  1522.330  80.46641 3.346749e-05 -4.521041
> 11773 1784.483  73.76620 3.346749e-05 -4.521053
>> #    Now lets do all together in Anova
>> #
>>  all <- cbind(nzw,akr)
>>  ts <- c(1,1,1,2,2,2,3,3,3,4,4,4)
>>  ts <- as.factor(ts)
>>  levels(ts) <- c("nzwC","nzwT","akrC","akrT")
>>  design <- model.matrix(~0+ts)
>>  colnames(design) <- levels(ts)
>>  fit4 <- lmFit(all,design)
>>  cont.matrix <- makeContrasts(
> +      Baseline = akrC - nzwC,
> +      NZW_Smk = nzwT - nzwC,
> +      AKR_Smk = akrT - akrC,
> +      Diff = (akrT - akrC) - (nzwT - nzwC),
> +      levels=design)
>>   fit42 <- contrasts.fit(fit4,cont.matrix)
>>   fit42 <- eBayes(fit42)
>> #
>>   topTable(fit42,coef="Baseline",adjust="fdr",number=5)
>                M         t     P.Value         B
> 3189    942.0993  13.57485 0.004062283 -4.528799
> 8607   2634.1826  11.23476 0.006913442 -4.530338
> 10242  -942.2860 -10.99253 0.006913442 -4.530551
> 283    -609.0831 -10.79354 0.006913442 -4.530735
> 3224  -1564.2572 -10.19429 0.008089034 -4.531351
> ----------------------------------------------------
> ------------- Shouldn't this be equal to fit1 above?
> ----------------------------------------------------
>>   topTable(fit42,coef="NZW_Smk",adjust="fdr",number=5)
>              M         t   P.Value         B
> 7724 -246.5956 -8.687324 0.1615395 -4.591133
> 1403 -307.8660 -7.063312 0.4066814 -4.591363
> 3865 -253.4899 -6.585582 0.4598217 -4.591457
> 3032 -509.2413 -5.841901 0.8294166 -4.591640
> 2490 -240.3259 -5.338679 0.9997975 -4.591795
> ----------------------------------------------------
> ------------- Shouldn't this be equal to fit2 above?
> ------------- The P.Value are unreal!!
> ----------------------------------------------------
>>   topTable(fit42,coef="AKR_Smk",adjust="fdr",number=5)
>              M        t  P.Value         B
> 11547 151.6622 6.380978 0.917470 -4.595085
> 12064 324.0851 6.337235 0.917470 -4.595085
> 6752  964.5478 5.858994 0.952782 -4.595086
> 10251 152.7587 5.339843 0.952782 -4.595087
> 1440  189.6056 4.933151 0.952782 -4.595089
> ----------------------------------------------------
> ------------- Shouldn't this be equal to fit3 above?
> ------------- The P.Value are unreal!!
> ----------------------------------------------------
>>   topTable(fit42,coef="Diff",adjust="fdr",number=5)
>               M         t   P.Value         B
> 7724   302.6892  7.540195 0.4102211 -4.593201
> 1403   419.4962  6.805495 0.4102211 -4.593265
> 10251  270.5269  6.686796 0.4102211 -4.593277
> 3270   409.8391  6.414966 0.4192042 -4.593307
> 10960 -511.4711 -5.469247 0.9652171 -4.593435
>> #
>>
> So the results I get from just pairwise comparisons are very significant, but
> when I try the Anova way, the significance completely dissapears.
> Am I doing something completely wrong?

Your commands look correct but your data looks crazy.  The M-values are supposed to be log-fold
changes, and you're getting values in the 10s of thousands.  Perhaps you are trying analyse data
on the raw intensity scale and have huge differences between arrays and groups or very large
outliers.  Note that the B-statistics are telling you that there is absolutely no differential
expression throughout.  Warning bells should go up when you see such large t-statistics with such
small B-statistics.  Please look at your data and do some quality assessment.  At very least you
probably need to log-transform.

Your 4-group approach should give the same M-values as the 2-group approach, but the standard
errors and t-statistics will change because your error estimates change.

Gordon

> This is data from Affimetrix mouse chips.
> Thanks for any help
> Heberto Ghezzo
> Ph.D.
> Meakins-Christie Labs
> McGill University
> Montreal - Canada




More information about the R-help mailing list