[BioC] Finding differentially expressed genes using Ebayes
Saroj K Mohapatra
saroj at vt.edu
Mon May 11 15:42:37 CEST 2009
Hi,
I am trying to understand the situation. Creating a contrast matrix
(Sean's suggestion) is a very good idea. I usually do it that way.
For the current problem, the design matrix would do the following.
Assuming that the samples 1:3 belong to normal and 4:6 belong to cancer.
> design <- model.matrix(~factor(c(1,1,1,2,2,2)))
produces:
> design
(Intercept) factor(c(1, 1, 1, 2, 2, 2))2
1 1 0
2 1 0
3 1 0
4 1 1
5 1 1
6 1 1
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2))`
[1] "contr.treatment"
Here 'design' constitutes the matrix X in the linear models:
y_g = X*beta_g + epsilon_g
where beta is a 2-vector of unknown parameters defining overall and
differential expression averages for gene g, y is the log(R/G) for gene
g, and epsilon_g is a zero-mean random disturbance
In this application, X is 6 x 2. for the first 3 samples on gene g we have,
y_g = beta1_g + epsilon_g
and for the 4th through the sixth samples we have
y_g = beta1_g + beta2_g + epsilon_g
beta1 is thus the average log ratio comparing the normal to pooled
reference, and beta2 is mean log(Cancer/Ref)-log(Normal/Ref) = mean
log(Cancer/Normal)
Therefore, I think, using the second coef in the function topTable would
produce result for the comparison of interest (cancer vs normal):
probes_interesting <- topTable(fit, coef=2, n=20);
Let me quickly add that I am relatively new to limma; please confirm my
approach or correct if I am wrong some where.
Best wishes,
Saroj
John Herbert wrote:
> Hello.
> I have 6 gpr files, 3 from cancer and 3 from normals. I have normalised them with printtiplowess and removed the control spots. My question is about finding differentially expressed genes with Ebayes. I have run the code below and get some strange results that are very different to using Samr. I am thinking I trust samr more as when I look at the results by eye, they seem to order the genes coffectly based on fold ratio. However, it could be that I am not using ebayes correctly.
>
> So, the design is supposed to represent cancer (the 1s) and normal (the 2s). Is this the correct way to construct a design matrix? I want to compare gene expression between cancer and normal (3 replicates each).
>
> design <- model.matrix(~factor(c(1,1,1,2,2,2)));
>
> fit <- lmFit(probes_only.imp, design);
>
> fit <- eBayes(fit);
>
> probes_only.imp_50_ebayes <- topTable(fit, n=20);
>
> Any help appreciated on this, thank you.
>
> Kind regards,
>
> John.
>
>
> =================================
> Bioinformatics Officer
>
> Molecular Angiogenesis group
> First Floor, Institute of Biomedical Research
> The University of Birmingham
> Medical School
> Edgbaston
> Birmingham
> B15 2TT (UK)
>
> j.m.herbert at bham.ac.uk
>
> Tel: +44 121 414 3733
> Fax: +44 121 415 8677
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
More information about the Bioconductor
mailing list