[BioC] edgeR: GLM for multi-factor and multi-level designs

Dorota Herman dorota.herman at psb.vib-ugent.be
Tue Nov 27 16:18:20 CET 2012


Hello everyone,

I have been experimenting with the GLM approach for RNA-seq data in DESeq and 
edgeR, but I am fairly new to DE analyses. I am interested in pairwise 
comparisons in multi-factor, multi-level designs. My question concerns how 
the glmLRT function should be applied.

#My code is
>countsTable <- read.delim(file)
>header <- 
c('A_1','A_2','A_3','B_1','B_2','B_3','C_1','C_2','C_3','D_1','D_2','D_3')
>names(countsTable) <- header
>conds <- factor(c('A','A','A','B','B','B','C','C','C','D','D','D'))
>Ex<-factor(c('exper1', 'exper2', 'exper3', 'exper2', 'exper3', 'exper4', 
'exper1', 'exper3', 'exper4', 'exper2', 'exper3', 'exper4'))
>group <- conds
>dge <- DGEList(counts=countsTable,group=group)
>dge <- calcNormFactors(dge, method='TMM')
>design <- model.matrix(~Ex+conds)
>rownames(design)<-colnames(dge)
>dge <- estimateGLMCommonDisp(dge,design)
>dge <- estimateGLMTrendedDisp(dge, design)
>dge <- estimateGLMTagwiseDisp(dge, design)
>fit <- glmFit(dge, design)

#my design looks like:
>  design
    (Intercept) Eexper2 Eexper3 Eexper4 condsB condsC condsD
A_1           1       0       0       0      0      0      0
A_2           1       1       0       0      0      0      0
A_3           1       0       1       0      0      0      0
B_1           1       1       0       0      1      0      0
B_2           1       0       1       0      1      0      0
B_3           1       0       0       1      1      0      0
C_1           1       0       0       0      0      1      0
C_2           1       0       1       0      0      1      0
C_3           1       0       0       1      0      1      0
D_1           1       1       0       0      0      0      1
D_2           1       0       1       0      0      0      1
D_3           1       0       0       1      0      0      1
attr(,"assign")
[1] 0 1 1 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$E
[1] "contr.treatment"
attr(,"contrasts")$conds
[1] "contr.treatment"

I understand that R reparameterized the model matrix (dropping the first 
level of each factor) because of the identifiability problem in parameter 
estimation. However, this confuses me when I use the design for pairwise 
comparisons.
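
To illustrate the identifiability point, here is a minimal base-R sketch (not part of the original analysis). It assumes the experiment factor is named E, matching the column names printed in the design, and the matrix called full is a hypothetical over-parameterized design built only for comparison.

```r
# Factors as in the post (experiment factor named E to match the printed columns)
conds <- factor(rep(c("A", "B", "C", "D"), each = 3))
E <- factor(c("exper1", "exper2", "exper3", "exper2", "exper3", "exper4",
              "exper1", "exper3", "exper4", "exper2", "exper3", "exper4"))

# Default treatment coding: the first level of each factor is absorbed
# into the intercept
design <- model.matrix(~ E + conds)

# Hypothetical over-parameterized design: intercept plus one dummy per level
full <- cbind(Intercept = 1, model.matrix(~ 0 + E), model.matrix(~ 0 + conds))

ncol(full)       # 9 columns
qr(full)$rank    # 7: two columns are redundant
ncol(design)     # 7 columns
qr(design)$rank  # 7: every coefficient is estimable
```

Under this coding, exper1 and A are not discarded. They form the baseline that the intercept represents, and each remaining coefficient is a difference from that baseline.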

When I want to find the genes differentially expressed between A and B, 
I understand I should use:
>lrt <- glmLRT(fit,coef="condsB")
Is it correct?

When I want to find the genes differentially expressed between C and D 
(*without taking A into account*), are these function calls correct?
>C_D<-makeContrasts(condsC-condsD,levels=design)
>lrt <- glmLRT(fit,contrast=C_D)

Does this mean that glmLRT takes the first level of conds (A) into account 
when we use the 'coef' argument and discards it when we use the 'contrast' 
argument? Or does it mean that the second analysis, between C and D, also 
takes differential expression relative to A into account?
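
One way to check the two readings is to work on the linear-predictor scale in base R, without edgeR. The sketch below is only an illustration: the coefficient values in beta are hypothetical, and mu() is a helper invented here to compute the fitted log-mean for a given condition and experiment under the treatment coding shown in the design.

```r
# Hypothetical coefficients on the log scale, named like the design columns
beta <- c("(Intercept)" = 2.0, Eexper2 = 0.3, Eexper3 = -0.1, Eexper4 = 0.2,
          condsB = 0.5, condsC = 1.0, condsD = -0.4)

# Hypothetical helper: fitted log-mean for one condition in one experiment
mu <- function(cond, exper) {
  x <- setNames(numeric(length(beta)), names(beta))
  x["(Intercept)"] <- 1
  if (exper != "exper1") x[paste0("E", exper)] <- 1
  if (cond != "A") x[paste0("conds", cond)] <- 1
  sum(x * beta)
}

# coef = "condsB" tests this single coefficient, i.e. B minus the baseline A
b_vs_a <- mu("B", "exper2") - mu("A", "exper2")
all.equal(b_vs_a, unname(beta["condsB"]))

# The contrast condsC - condsD as a numeric vector over the design columns
contrast_CD <- c(0, 0, 0, 0, 0, 1, -1)
c_vs_d <- mu("C", "exper3") - mu("D", "exper3")
all.equal(c_vs_d, sum(contrast_CD * beta))

# With edgeR loaded and fit available, the same vector could be passed as
# glmLRT(fit, contrast = contrast_CD)   # not run here
```

In this sketch condsC is C minus A and condsD is D minus A, so in their difference the A baseline cancels and what remains is C minus D within the same experiment.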

I hope my explanation of the question is not too confusing.
Best wishes
Dorota

-- 
==================================================================
Dorota Herman, PhD 
VIB Department of Plant Systems Biology, Ghent University
Technologiepark 927
9052 Gent, Belgium
Tel: +32 (0)9 3313692
Email:dorota.herman at psb.vib-ugent.be
Web: http://www.psb.ugent.be


