[BioC] Finding differentially expressed genes using Ebayes

Mon May 11 17:55:32 CEST 2009

Hi all. 
Thanks for all your suggestions and help, Sean, James and Saroj. 

Before I saw the last two messages, I tried this based on the Limma manual and it seems to agree with Samr now.

design <- model.matrix(~ -1+factor(c(1,1,1,2,2,2)));
colnames(design) <- c("group1", "group2");
fit <- lmFit(eset, design);
contrast.matrix <- makeContrasts(group2-group1, levels=design);
fit2 <- contrasts.fit(fit, contrast.matrix);
fit2 <- eBayes(fit2);
topTable(fit2, coef=1, adjust="fdr")

I assume this another way of doing Jame's suggestion. 

Thanks a lot again. 

John.

=================================
Bioinformatics Officer

Molecular Angiogenesis group
First Floor, Institute of Biomedical Research
The University of Birmingham
Medical School
Edgbaston
Birmingham
B15 2TT  (UK)

j.m.herbert at bham.ac.uk

Tel:  +44 121 414 3733
Fax: +44 121 415 8677 

-----Original Message-----
From: Saroj K Mohapatra [mailto:saroj at vt.edu]
Sent: Mon 11/05/2009 14:42
To: John Herbert
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Finding differentially expressed genes using Ebayes

Hi,

I am trying to understand the situation. Creating a contrast matrix 
(Sean's suggestion) is a very good idea. I usually do it that way.

For the current problem, the design matrix would do the following.

Assuming that the samples 1:3 belong to normal and 4:6 belong to cancer.

 > design <- model.matrix(~factor(c(1,1,1,2,2,2)))

produces:

 > design
  (Intercept) factor(c(1, 1, 1, 2, 2, 2))2
1           1                            0
2           1                            0
3           1                            0
4           1                            1
5           1                            1
6           1                            1
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2))`
[1] "contr.treatment"

Here 'design' constitutes the matrix X in the linear models:
y_g = X*beta_g + epsilon_g

where beta is a 2-vector of unknown parameters defining overall and 
differential expression averages for gene g, y is the log(R/G) for gene 
g, and epsilon_g is a zero-mean random disturbance

In this application, X is 6 x 2. for the first 3 samples on gene g we have,
y_g = beta1_g + epsilon_g

and for the 4th through the sixth samples we have
y_g = beta1_g + beta2_g + epsilon_g

beta1 is thus the average log ratio comparing the normal to pooled 
reference, and beta2 is mean log(Cancer/Ref)-log(Normal/Ref) = mean 
log(Cancer/Normal)

Therefore, I think, using the second coef in the function topTable would 
produce result for the comparison of interest (cancer vs normal):
probes_interesting <- topTable(fit, coef=2, n=20);

Let me quickly add that I am relatively new to limma; please confirm my 
approach or correct if I am wrong some where.

Best wishes,

Saroj

John Herbert wrote:
> Hello.
> I have 6 gpr files, 3 from cancer and 3 from normals. I have normalised them with printtiplowess and removed the control spots. My question is about finding differentially expressed genes with Ebayes. I have run the code below and get some strange results that are very different to using Samr. I am thinking I trust samr more as when I look at the results by eye, they seem to order the genes coffectly based on fold ratio. However, it could be that I am not using ebayes correctly. 
>
> So, the design is supposed to represent cancer (the 1s) and normal (the 2s). Is this the correct way to construct a design matrix? I want to compare gene expression between cancer and normal (3 replicates each).  
>
> design <- model.matrix(~factor(c(1,1,1,2,2,2)));
>
> fit <- lmFit(probes_only.imp, design);
>
> fit <- eBayes(fit);
>
> probes_only.imp_50_ebayes <- topTable(fit, n=20);
>
> Any help appreciated on this, thank you. 
>
> Kind regards,
>
> John.
>
>
> =================================
> Bioinformatics Officer
>
> Molecular Angiogenesis group
> First Floor, Institute of Biomedical Research
> The University of Birmingham
> Medical School
> Edgbaston
> Birmingham
> B15 2TT  (UK)
>
> j.m.herbert at bham.ac.uk
>
> Tel:  +44 121 414 3733
> Fax: +44 121 415 8677 
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>