[BioC] Finding differentially expressed genes using Ebayes

Saroj K Mohapatra saroj at vt.edu
Mon May 11 15:42:37 CEST 2009

Hi,

I am trying to understand the situation. Creating a contrast matrix
(Sean's suggestion) is a very good idea. I usually do it that way.

For the current problem, the design matrix would do the following.

Assuming that the samples 1:3 belong to normal and 4:6 belong to cancer.

> design <- model.matrix(~factor(c(1,1,1,2,2,2)))

produces:

> design
(Intercept) factor(c(1, 1, 1, 2, 2, 2))2
1           1                            0
2           1                            0
3           1                            0
4           1                            1
5           1                            1
6           1                            1
attr(,"assign")
 0 1
attr(,"contrasts")
attr(,"contrasts")\$`factor(c(1, 1, 1, 2, 2, 2))`
 "contr.treatment"

Here 'design' constitutes the matrix X in the linear models:
y_g = X*beta_g + epsilon_g

where beta is a 2-vector of unknown parameters defining overall and
differential expression averages for gene g, y is the log(R/G) for gene
g, and epsilon_g is a zero-mean random disturbance

In this application, X is 6 x 2. for the first 3 samples on gene g we have,
y_g = beta1_g + epsilon_g

and for the 4th through the sixth samples we have
y_g = beta1_g + beta2_g + epsilon_g

beta1 is thus the average log ratio comparing the normal to pooled
reference, and beta2 is mean log(Cancer/Ref)-log(Normal/Ref) = mean
log(Cancer/Normal)

Therefore, I think, using the second coef in the function topTable would
produce result for the comparison of interest (cancer vs normal):
probes_interesting <- topTable(fit, coef=2, n=20);

Let me quickly add that I am relatively new to limma; please confirm my
approach or correct if I am wrong some where.

Best wishes,

Saroj

John Herbert wrote:
> Hello.
> I have 6 gpr files, 3 from cancer and 3 from normals. I have normalised them with printtiplowess and removed the control spots. My question is about finding differentially expressed genes with Ebayes. I have run the code below and get some strange results that are very different to using Samr. I am thinking I trust samr more as when I look at the results by eye, they seem to order the genes coffectly based on fold ratio. However, it could be that I am not using ebayes correctly.
>
> So, the design is supposed to represent cancer (the 1s) and normal (the 2s). Is this the correct way to construct a design matrix? I want to compare gene expression between cancer and normal (3 replicates each).
>
> design <- model.matrix(~factor(c(1,1,1,2,2,2)));
>
> fit <- lmFit(probes_only.imp, design);
>
> fit <- eBayes(fit);
>
> probes_only.imp_50_ebayes <- topTable(fit, n=20);
>
> Any help appreciated on this, thank you.
>
> Kind regards,
>
> John.
>
>
> =================================
> Bioinformatics Officer
>
> Molecular Angiogenesis group
> First Floor, Institute of Biomedical Research
> The University of Birmingham
> Medical School
> Edgbaston
> Birmingham
> B15 2TT  (UK)
>
> j.m.herbert at bham.ac.uk
>
> Tel:  +44 121 414 3733
> Fax: +44 121 415 8677
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>