[BioC] Agi4x44Preprocess/Limma and number of significant p-values

Wed Nov 2 20:47:25 CET 2011

> 
Hi everyone

I have used Agi4x44Preprocess (script below) to analyze three mouse (mgug4122a) Agilent microarray sets and I want to check if the methods I'm using make sense, as the number of significant p-values I'm getting seems quite high. I'm using Limma for the significance analysis. the lowest value I'm getting is 1.09667003222346e-12, while the adjusted p-value 2.43208513046197e-08. If I consider significant p-value below 0.01, I get 9721 significant ones, roughly half of the gene list. This was for one of the datasets, with 2 (four result sets) micorarrays, 1 for each treatment. For another similar dataset, 4 microarrays with 2 samples for each treatment, results are very similar.

I also tried the approach in this page http://matticklab.com/index.php?title=Single_channel_analysis_of_Agilent_microarray_data_with_Limma and the results seem better (I mean less p-values below 0.01). 

I was wondering what would be the suggested approach and if there's any reason that I'm getting so many values below 0.01, with some extreme values at 10-12 range. Is there a bug on Agi4x44Preprocess? It seism to get the same columns as the approach on the MAttick's lab webpage.

Thanks in advance for any help 

Paulo

library(Agi4x44PreProcess)
library(mgug4122a.db)
targets=read.targets(infile="targets.txt")
dd=read.AgilentFE(targets, makePLOT=FALSE)

CV.rep.probes(dd, "mgug4122a.db", foreground = "MeanSignal", raw.data = TRUE, writeR = T, targets)
genes.rpt.agi(dd, "mgug4122a.db", raw.data = TRUE, WRITE.html = T, REPORT = T)
ddNORM = BGandNorm(dd, BGmethod = "half", NORMmethod = "quantile", foreground = "MeanSignal", background = "BGMedianSignal", offset = 50, makePLOTpre = FALSE, makePLOTpost = F)

ddFILT = filter.probes(ddNORM, control = TRUE, wellaboveBG = TRUE,
                      isfound = TRUE, wellaboveNEG = TRUE, sat = TRUE, PopnOL = TRUE,
                      NonUnifOL = T, nas = TRUE, limWellAbove = 75, limISF = 75,
                      limNEG = 75, limSAT = 75, limPopnOL = 75, limNonUnifOL = 75,
                      limNAS = 100, makePLOT = F, annotation.package = "mgug4122a.db",
                      flag.counts = T, targets)
ddPROC = summarize.probe(ddFILT, makePLOT = F, targets)
esetPROC = build.eset(ddPROC, targets, makePLOT = F, annotation.package = "mgug4122a.db")
write.eset(esetPROC,ddPROC,"mgug4122a.db",targets)

levels.treatment = levels(factor(targets$Treatment))
treatment = factor(targets$Treatment, levels = levels.treatment)
design = model.matrix(~0 + treatment)
print(design)
colnames(design) = c("EV50", "SHGR19")
fit = lmFit(esetPROC, design)
CM = makeContrasts(EV50vsSHGR19=EV50-SHGR19, levels=design )

fit2 = contrasts.fit(fit, CM)
fit2 = eBayes(fit2)
my.lfc <- 0
output <- topTable(fit2, coef=1, adjust.method="BH", lfc=my.lfc, number = 20000)
write.table(output, file="output.txt", sep="\t", quote=FALSE)