[BioC] Experimental design with edgeR and DESeq packages (RNA-seq)

Yvan Wenger yvan.wenger at unige.ch
Thu Nov 15 12:09:10 CET 2012


Hi everybody,

I just started using edgeR and DESeq and am looking for confirmation
that I am not doing something silly.

Basically, we have 7 conditions, and for only 2 of these conditions do
we have biological triplicates. For simplicity, let us say that the
samples are "A", "A", "A", "B", "C" (most of the genes are NOT regulated
in my experiment). Finally, let us say we just want to compare "B" to
"C", but using all the information available. Can we use the whole
dataset for estimating the common and tagwise dispersion? Typically
using the commands below (note that I compare "B" to "C" here, i.e. two
samples without replicates).

edgeR:
countTable <- read.table('mytable', header=FALSE, row.names=1)
dge <- DGEList(counts=countTable, group=c("A","A","A","B","C"))
dge <- calcNormFactors(dge)
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
et <- exactTest(dge, pair=c("B","C"))

or

DESeq:
countTable <- read.table('mytable.csv', header=FALSE, row.names=1)
design <- data.frame(row.names=colnames(countTable), condition=c("A","A","A","B","C"))
condition <- design$condition
cds <- newCountDataSet(countTable, condition)
cds <- estimateSizeFactors(cds)
cds <- estimateDispersions(cds)
res <- nbinomTest(cds, "B", "C")

Is it OK to do so (i.e. to use samples that are not compared in the end
to estimate the dispersion)? Does this correspond to the example
"working partially without replicates" from the DESeq manual? Or should
I just consider that there are no replicates for samples B and C and
proceed by ignoring the other samples completely?
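
In case it helps to reproduce the edgeR variant without my table, here
is a minimal sketch on simulated counts. The count matrix, the gene and
sample names, and the simulation parameters are made up for
illustration (there is no true differential expression in them); the
edgeR calls are the same ones as above, plus topTags to look at the
result.

```r
library(edgeR)

set.seed(1)
n_genes <- 1000
group <- c("A", "A", "A", "B", "C")

# Made-up negative binomial counts: same mean for every gene and sample,
# so no gene is truly differentially expressed (assumption for illustration).
counts <- matrix(
  rnbinom(n_genes * length(group), mu = 100, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("s", seq_along(group)))
)

# Dispersion is estimated on the full DGEList, i.e. with the "A" triplicate included.
dge <- DGEList(counts = counts, group = group)
dge <- calcNormFactors(dge)
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)

# Compare the two unreplicated samples, reusing the dispersions estimated above.
et <- exactTest(dge, pair = c("B", "C"))
head(topTags(et)$table)
```

With real data, only the counts and the group vector would change.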

Many thanks!

Yvan


