[BioC] DESeq /edgeR design problem for a small RNASeq

Gordon K Smyth smyth at wehi.EDU.AU
Wed Jun 5 03:35:17 CEST 2013

Dear Eduardo,

The design matrix you've used for edgeR is not correct, as I think you've 
already come to understand in your discussions with Simon.

Your four group experiment is of the type covered by Section 3.3 in the 
edgeR User's guide.  Why not have a read and follow the instructions?

I think that trying to interpret model formula in R may be causing you 
some confusion.  I recommend the simpler more intuitive approach described 
in Section 3.3.1 of the edgeR User's Guide.  edgeR gives you a way to 
explicitly say that you want:

   (MTsD - MTcD) - (WTsD -WTcD)

instead of trying to understand the meaning of coefficients in linear 

BTW, please show sessionInfo() output.  Are you using the latest version 
of Bioconductor?  If not, upgrade!

Best wishes

> Date: Mon, 3 Jun 2013 13:06:55 +0200
> From: Eduardo Andr?s Le?n <eandres at cnio.es>
> To: "<bioconductor at r-project.org>" <bioconductor at r-project.org>
> Subject: [BioC] DESeq /edgeR design problem for a small RNASeq
> Hi all,
> 	I'm trying to analyse data coming from a Small RNAseq with a "complex" design.
> I have 2 different kind of mice,a wild type and a mutant. The mutant has 
> an insert to over express a gene of interest. and It's a different mouse 
> from the wild type because the wild type with the insert did't born.

> Second, in order to over express a gene, we add a drug, so we have 4 
> possibilities :
> WTsD (wild type no treated)
> WTcD (wild type treated) . Just to see if adding the drug is also affecting the mouse expression profile
> MTsD (mouse with the insertion) . In this case, to see if the insertion is also affecting the normal expression profile of miRNAs.
> MTcD (mouse with the insertion + drug) . How the over expression affects.
> Of course this is also a time course design because the drug is added at 
> 7 days, 9 days and 11 days,
> So first of all, I'd like to see DE miRNAs taking into account just the 
> overexpression of the gene (that is by removing the changes caused for 
> the drug and for the mouse type ).
> I've tried with DESeq with the following code :
> library(DESeq)
> #Reading count data
> data<-read.table(file="ALL_MTcD_9.5",header=TRUE,row.names=1)
> #Design
> Design<-data.frame(
>  row.names=colnames(data),
>  treatment =c("untreated","untreated","untreated","treated","treated","treated","untreated","untreated","untreated","treated","treated","treated"),
>  mouseType =c("WT","WT","WT","WT","WT","WT","MT","MT","MT","MT","MT","MT")
> )
> cdsFull<-newCountDataSet(data,Design)
> cdsFull<-estimateSizeFactors(cdsFull)
> cdsFull<-estimateDispersions(cdsFull)
> #fit with the mouse and the drug
> experimento<-fitNbinomGLMs(cdsFull,count ~ mouseType + treatment)
> #fit with the mouse + drug + interaction from mouse:drug
> experimento_todos_factores<-fitNbinomGLMs(cdsFull,count ~ mouseType + treatment + mouseType:treatment)
> pvalsGLM<-nbinomGLMTest(experimento_todos_factores,experimento)
> padjGLM<-p.adjust(pvalsGLM,method="BH")
> experimento$pval<-pvalsGLM
> experimento$adj.pval<-padjGLM
> This gives me 21 DE miRNAs.
> As I'm not really sure if this design is correct. I've never seen something like this. So I've also used the edgeR package  using the example 4.4 in the vignette.
> library(edgeR)
> Mouse<-factor(c("WT","WT","WT","WT","WT","WT","MT","MT","MT","MT","MT","MT"))
> Treatment<-factor(c("Untreated","Untreated","Untreated","Treated","Treated","Treated","Untreated","Untreated","Untreated","Treated","Treated","Treated"))
> data.frame(Sample=colnames(y),Mouse,Treatment)
> design<-model.matrix(~Mouse+Treatment)
> rownames(design)<-colnames(y)
> y<-estimateGLMCommonDisp(y,design,verbose=TRUE)
> #Estimation of the gene-wise dispersion
> y<-estimateGLMTrendedDisp(y,design)
> y<-estimateGLMTagwiseDisp(y,design)
> #Diff expresion
> fit<-glmFit(y,design)
> lrt<-glmLRT(fit)
> topedgeR<-topTags(lrt,n=100)
> So I guess that I'm testing the drug adjusting for baseline differences 
> between the 2 mouse types. And I have 26 DE miRNAs
> The problem here is that there is only 5 common miRNAs between DESeq and 
> edgeR.
> Is the design correct ?what I'm doing wrong ?
> Thanks in advance !

The information in this email is confidential and intend...{{dropped:4}}

More information about the Bioconductor mailing list