[BioC] LIMMA: Why does eBayes expression differ from observed?
Edwin Groot
edwin.groot at biologie.uni-freiburg.de
Fri Jul 31 11:51:52 CEST 2009
On Thu, 30 Jul 2009 14:57:43 +0200
Axel.Klenk at Actelion.Com wrote:
>
> Dear Edwin,
>
> first of all, the code you have posted cannot work because it never
> assigns
> the results
> of your computations to any variables. In order to help you, we need
> to
> know at least
> 1) what code you have really run,
> 2) your refdesign matrix,
> 3) the values from topTable() and RG.MA() for at least one example
> probe,
> 4) HOW you backcalculated R, G, and FC from topTable(), and
> 5) the obligatory output of sessionInfo() although this doesn't
> really
> sound like a version issue. :-)
>
---
Here is the code, with minimal commenting:
> refdesign <- modelMatrix(targets, ref="WT")
> refdesign
cf1 mut1 mut1_cf1
[1,] 0 1 0
[2,] 1 0 0
[3,] 0 0 1
[4,] 0 1 0
[5,] 1 0 0
[6,] 0 0 1
[7,] 0 1 0
[8,] 1 0 0
[9,] 0 0 1
> RG <- read.maimages(targets$FileName, columns = list(G =
"gProcessedSignal", Gb = "gBGMeanSignal", R = "rProcessedSignal", Rb =
"rBGMeanSignal"), annotation= c("FeatureNum", "Row", "Col", "ProbeUID",
"ControlType", "ProbeName", "Description", "GeneName",
"SystematicName"))
> i <- RG$genes$ControlType==0
> RGnc <- RG[i,]
#RGnc already contains BG-subtracted and normalized intensities
> MA <- MA.RG(RGnc, bc.method="none")
#Intensity spread among arrays dissimilar
> MAAq <- normalizeBetweenArrays(MA, method="Aquantile")
> fit <- lmFit(MAAq, refdesign)
> fitWT <- eBayes(fit)
> RGAq <- RG.MA(MAAq)
> options(digits=2)
> out.mut1 <- topTable(fitWT, coef="mut1",
genelist=cbind(fit$genes$GeneName,RGnc$G[,1],RGnc$R[,1],RGAq$G[,1:9],RGnc$G[,1],RGAq$R[,1],RGAq$R[,4],RGAq$R[,7],RGnc$R[,1]))
#Average WT (green) intensity
> out.mut1[,2] <- (2*2^out.mut1$AveExpr)/(2^out.mut1$logFC+1)
#Average mut1 (red) intensity
> out.mut1[,3] <- (2*2^out.mut1$AveExpr)/(2^-out.mut1$logFC+1)
> out.mut1[,2:17] <- as.numeric(as.matrix(out.mut1[,2:17]))
#Calculated mean WT intensity
> for (g in 1:10) out.mut1[g,13] <-
mean(as.numeric(as.matrix(out.mut1[g,4:12])))
#Calculated mean mut1 intensity
> for (g in 1:10) out.mut1[g,17] <-
mean(as.numeric(as.matrix(out.mut1[g,14:16])))
#Fold Change
> out.mut1$logFC <- 2^out.mut1$logFC
> colnames(out.mut1)[1:18] <-
c("GeneName","AvWT","Avmut1","WT1","WT2","WT3","WT4","WT5","WT6","WT7",
"WT8","WT9","CalcWT","mut11","mut12","mut13","Calcmut1","FoldChange")
> out.mut1[9,]
GeneName AvWT Avmut1 WT1 WT2 WT3 WT4 WT5 WT6 WT7 WT8 WT9 CalcWT
10943 AT1G65370 31 298 64 57 37 56 47 59 69 64 69 58
mut11
10943 423
mut12 mut13 Calcmut1 FoldChange AveExpr t P.Value adj.P.Val B
10943 683 727 611 9.5 7.4 12 1.2e-06 0.0055 4.7
---
Pardon the hack with mean(). It did not like the factor levels in the
data frame.
The WT and mut1 replicates, "observed RG", from the MAAq object are in
WT1 to Wt9 and mut11 to mut13.
In the output table out.mut1, the expression calculated from topTable
(AvWT and Avmut1) is about half that of the expression calculated from
the MAAq object (CalcWT and Calcmut1). The question is why are they so
different?
The FC calculated from the MAAq object (10.5) also differs from the FC
from topTable (9.5).
Thanks in advance for your help,
Edwin
---
> AFAIK, eBayes() will affect the t, p, and B statistics computed from
> your M
> values but not
> the AveExpr and logFC.
>
> RG.MA() backcalculates background-adjusted and normalized R and G
> values,
> so I am
> not sure what you mean by "observed" RG -- you don't get back the
> original
> R, Rb, G, Gb
> from an MAList via RG.MA().
>
> Cheers,
>
> - axel
>
>
> Axel Klenk
> Research Informatician
> Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil
> /
> Switzerland
>
>
>
>
>
> "Edwin Groot"
>
> <edwin.groot at biol
>
> ogie.uni-freiburg
> To
> .de> bioconductor at stat.math.ethz.ch
>
> Sent by:
> cc
> bioconductor-boun
>
> ces at stat.math.eth
> Subject
> z.ch [BioC] LIMMA: Why does eBayes
>
> expression differ from
> observed?
>
>
> 30.07.2009 12:38
>
>
>
>
>
>
>
>
>
>
>
>
>
> I am getting odd results from a common reference design analysis of
> two-colour data in LIMMA. Previously I had analyzed only
> simple-comparison designs. Hopefully you can help restore credibility
> of Bioconductor to my supervisor.
> Why does the AveExpr and logFC reported in topTable() differ from the
> replicates in my MA object?
>
> The analysis is a textbook example of comparing a series of mutants
> to
> the wild type. Wild type is always green. The summary is as follows:
>
> lmfit(MA, refdesign)
> eBayes(fit)
> topTable(fit, coef="mut1")
> #Regenerate the RG from MA
> RG.MA(MA)
>
> Backcalculating the topTable AveExpr and logFC to green, red and FC
> gives the following expressions as an example:
> WT: 31 mut1: 298 FC: 9.5
> Compare that to the average of 9 WT and 3 mut1 in the RG.MA(MA) list:
> WT: 58 mut1: 611 FC: 10.5
>
> A survey of other probes gives an over and underestimate of the
> observed RG from 1.5 to 5 times.
> What is the explanation for that?
> What should I troubleshoot, besides looking at the usual BG and FG
> distributions and MA plots?
>
> Regards,
> Edwin
> ---
> Dr. Edwin Groot, postdoctoral associate
> AG Laux
> Institut fuer Biologie III
> Schaenzlestr. 1
> 79104 Freiburg, Deutschland
> +49 761-2032945
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list