[BioC] Questions about using Limma

Wed May 5 15:41:48 CEST 2010

Hi Wei,

wei tie wrote:
> Hello ,
> 
> I am currently working on 30 Affymetrix arrays for a time course
> experiment using Limma package. I have two groups (A and B) with
> samples taken at 0hr, 1hr, 6hr, 12hrs and 24hrs.
> 
> The following is how my targets file looks like:
> FileName      Targets
> Files 1-3       A.0hr
> Files 4-6       A.1hr
> Files 7-9       A.6hr
> Files 10-12     A.12hr
> Files 13-15     A.24hr
> Files 16-18     B.0hr
> Files 19-21     B.1hr
> Files 22-24     B.6hr
> Files 25-27     B.12hr
> Files 28-30     B.24hr
> 
> If I apply the lmFit() function to all the 30 arrays as follows:
> 
> sample<-c("A0","A1","A6","A12","A24","B0","B1","B6","B12","B24")
> all<-factor(rep(sample,each=3),levels=sample)
> design<-model.matrix(~0+all)
> colnames(design)<-sample
> fit1 <- lmFit(eset,design)
> contrast<- makeContrasts(B1-B0,A1-A0, (B1-B0)-(A1-A0),levels=design)
> fit2<- contrasts.fit(fit1, contrast)
> fit2<- eBayes(fit2)
> results1<-decideTests(fit2,method="global",adjust.method="BH",p.value=0.05,lfc=1)
> summary(results1)
>      B1 - B0      A1 - A0          (B1 - B0) - (A1 - A0)
> -1      1941        1133                        1155
> 0      26434       27285                       28635
> 1       1879        1836                         464
> 
> the result is that 1619(1155+464) genes changed at the first 1hr
> differently between group A and group B.
> 
> If I apply the lmFit() function only to the 12 samples taken at 0hr
> and 1hr , and use the following target file:
> FileName      Targets
> Files 1-3       A.0hr
> Files 4-6       A.1hr
> Files 7-9       B.0hr
> Files 10-12     B.1hr
> 
> then use the following script:
> sample1<-c("A0","A1","B0","B1")
> part<-factor(rep(sample,each=3),levels=sample)
> design<-model.matrix(~0+part)
> colnames(design)<-sample1
> fit3 <- lmFit(eset1,design)
> contrast<- makeContrasts(B1-B0,A1-A0, (B1-B0)-(A1-A0),levels=design)
> fit4<- contrasts.fit(fit3, contrast)
> fit4<- eBayes(fit4)
> results2<-decideTests(fit4,method="global",adjust.method="BH",p.value=0.05,lfc=1)
> summary(results2)
>      B1 - B0      A1 - A0           (B1 - B0) - (A1 - A0)
> -1      1501         825                         563
> 0      27478       28194                       29483
> 1       1275        1235                         208
> 
> the genes changing differently between group A and group B in terms of
> interaction decreased to 771(563+208).
> 
> If I use the method="separate" in decideTests()
> Results3<-decideTests(fit4,method="separate",adjust.method="BH",p.value=0.05,lfc=1)
> summary(results3)
>       B1 - B0     A1 - A0           (B1 - B0) - (A1 - A0)
> -1      1763         876                          67
> 0      26915       28058                       30169
> 1       1576        1320                          18
> 
> the genes which changed differently between group A and group B are
> much fewer (67+18=85).
> 
> Why do I get different lists of differentially expressed genes when I
> use the three approaches? Which result should I choose?

You get different results because you have different degrees of freedom 
for your statistics when you go from using all data to a subset. Fewer 
degrees of freedom translates to less power to detect differences.

When you use global vs separate for decideTests, you are adjusting for 
multiplicity over different numbers of comparisons (when you use global, 
you pile all the p-values into one vector and then do multiplicity 
adjustment, when using separate, the adjustment is done by comparison).

I don't think you will find anybody who is willing to tell you which one 
you should choose, especially via email. You might consider consulting 
with a local statistician if you need help.

Best,

Jim

> 
> I would really appreciate if someone can give some help.
> 
> Thank you,
> Regards,
> Wei Tie
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues