[BioC] multi-level, multi-factor results interpretation

Ingrid Lindquist iel at ncgr.org
Wed Apr 25 21:32:07 CEST 2012


Hi Simon,

2 questions:

(1)
I am working with a multi-factor experiment that, in a sense, doesn't 
have biological replicates (replicates were done at different times, and 
the differences in the chemistry/instrumentation is enough to be a 
rather large component of variance) and has three levels.  By three 
levels, I mean there are 2 different treatments and 1 control.  When I 
push the workflow through, everything works fine, but I'm having a hard 
time delineating which level is responsible for a gene being identified 
as significantly differentially expressed. I understand that (in the 
following example) ConditionX, ConditionY, AlternativeVariableold are 
log2 fold changes in relation to "ctrl" (in the case of condition) or 
"new" (in the case of AlternativeVariable).  Can I assume that the 
largest FC of these three fields within each gene instance is likely the 
reason that gene had a padj within significance? My plan is to separate 
out the dataset so that I'm only running 2 levels at once (x vs ctrl; y 
vs ctrl), but I'm hoping you can shed light on how I can interpret these 
results I've mentioned here.

The output I'm referring to has the following headers (generated by the 
last line (write.table) in my workflow below):

Gene    Pval    Padj    Intercept     ConditionX    ConditionY    
AlternativeVariableold    Deviance Converged


My workflow:

counts <- (file, header=TRUE, row.names=1)

Design<- data.frame(
+  row.names = colnames(SampledCounts),
+ Condition = c( "x", "ctrl", "ctrl", "x", "y", "y"),
+ AlterativeVariable= c( "old", "new", "old", "new", "new", "old"))

library(DESeq)

cdsFull <- newCountDataSet (counts, Design)
cdsFull <-estimateSizeFactors(cdsFull)
cdsFull <-estimateDispersions(cdsFull, method="blind", 
sharingMode="fit-only")
fit1<-fitNbinomGLMs(cdsFull, count ~ condition + AlternativeVariable)
fit0<-fitNbinomGLMs(cdsFull, count ~ AlternativeVariable)
pvalsGLM <-nbinomGLMTest(fit1, fit0)
padjGLM <-p.adjust(pvalsGLM, method = "BH")
write.table(data.frame(geneID=row.names(counts(cdsFull)), pval=pvalsGLM, 
padj=padjGLM, fit1), file= "result.txt")


(2)nd question:
Also, a followup question (I've since done the pairwise comparisons w/ 
the mult-factor design); is there a 'baseMean' stored anywhere within 
the multi-factor experiment workflow? I'd like to mock up MvA plots, but 
can't seem to find a value for BaseMean for each gene.

Thanks for your help,
Ingrid



More information about the Bioconductor mailing list