[BioC] setting contrast in edgeR

Ryan C. Thompson rct at thompsonclan.org
Sat Jan 5 00:46:54 CET 2013


Hello Alpesh,

If you inspect your design matrix, you should see that it contains an 
intercept column (named "(Intercept)" and two contrast columns (named 
"grp2" and "grp3"). It looks like you are rather expecting one column 
per group. With the design matrix that you specify, the intercept 
column represents cond1, and the other two columns represent 
"cond2-cond1" and "cond3-cond1" respectively, Therefore your specified 
contrast adds up thusly:

-1 * (cond1) + 0.5 * (cond2-cond1) + 0.5 * (cond3 - cond1) = (-cond1) + 
0.5 * (cond2 + cond3) - cond1 = (cond2+cond3)/2 - 2 * cond1.

In your artificial case where all conditions are equal, this clearly 
adds up to -cond1, not zero, so you will see significant calls. If you 
were expecting a design matrix with one column for each condition, you 
can get it by including a zero in the call to model.matirx, which will 
leave out the "(Intercept)" term, like this:

 design <- model.matrix(~0+grp)

Then the contrast that you are using should be correct. Alternatively, 
if I understand you correctly, you can use the existing design matrix 
with the intercept column and change your contrast to c(0, .5, .5). 
Also, if you want "cond1 - mean(cond2, cond3)", I think you will want 
to swap the signs of the elements in your contrast. This won't change 
the p-values, but it will make the fold changes have the opposite sign.

Please check your work, because I may misremember some of the details 
above, but I'm pretty sure the root of your problem is using a design 
matrix with an intercept when you expected one without an intercept.

Regards,

-Ryan Thompson

On Fri 04 Jan 2013 02:53:25 PM PST, Alpesh Querer wrote:
> Hi,
>
> I`m trying to find significant DE genes using the contrast (-1,.5,.5).
> I want to test {cond1-mean(cond2,cond3)}
> I created a data-set where all conditions have exact same values, divided
> into 3 conditions and 3 replicates/condition.
>
> Here is the data I`m using.
>
> 150    150    150    150    150    150    150    150    150
> 510    510    510    510    510    510    510    510    510
> 550    550    550    550    550    550    550    550    550
> 500    500    500    500    500    500    500    500    500
> 501    501    501    501    501    501    501    501    501
> 505    505    505    505    505    505    505    505    505
> 50    50    50    50    50    50    50    50    50
>
>
> If the run the following code, I get very significant p-values (in lrt
> table) and all genes are declared to be up-regulated.
> Am I interpreting the output incorrectly or do I need to set up the
> contrast in a different way?
>
>    h <- read.delim("test10.txt")
>    h <-h[rowSums(h)>0,]
>     grp <- factor(c(1,1,1,2,2,2,3,3,3))
>     design <- model.matrix(~grp)
>     h <- DGEList(counts=h,lib.size=colSums(h),group=grp)
>    h <- calcNormFactors(h)
>    h <- estimateGLMCommonDisp(h, design, verbose=TRUE)
>    h <- estimateGLMTrendedDisp(h, design)
>    h <- estimateGLMTagwiseDisp(h, design)
>   fit <- glmFit(h, design)
>    lrt <- glmLRT(fit,contrast=c(-1,0.5,0.5))
>    summary(de <- decideTestsDGE(lrt, p=0.05, adjust="BH"))
>
>     [,1]
> -1    0
> 0     0
> 1     6
>
>
>
>
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United
> States.1252
>
> attached base packages:
> [1] splines   stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
>   [1] epicalc_2.15.1.0        nnet_7.3-5
> MASS_7.3-22             survival_2.36-14        foreign_0.8-51
> edgeR_3.0.4
>   [7] limma_3.14.3            goseq_1.10.0
> geneLenDataBase_0.99.10 BiasedUrn_1.04          GenomicFeatures_1.10.1
> AnnotationDbi_1.20.3
> [13] Biobase_2.18.0          GenomicRanges_1.10.5
> IRanges_1.16.4          BiocGenerics_0.4.0      BiocInstaller_1.8.3
>
> loaded via a namespace (and not attached):
>   [1] biomaRt_2.14.0     Biostrings_2.26.2  bitops_1.0-5
> BSgenome_1.26.1    DBI_0.2-5          grid_2.15.2        lattice_0.20-10
> Matrix_1.0-10
>   [9] mgcv_1.7-22        nlme_3.1-105       parallel_2.15.2
> RCurl_1.95-3       Rsamtools_1.10.2   RSQLite_0.11.2     rtracklayer_1.18.1
> stats4_2.15.2
> [17] tools_2.15.2       XML_3.95-0.1       zlibbioc_1.4.0
>
> Thanks,
> Al
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list