[R] How to make a proper use of blocking in limma using voom

Steve Lianoglou lianoglou.steve at gene.com
Mon Apr 7 09:09:02 CEST 2014


Hi,

This is a bioconductor-related question so please post any follow up
questions on that mailing list. You can sign up to that list here:

https://stat.ethz.ch/mailman/listinfo/bioconductor

Comments in line:


On Sun, Apr 6, 2014 at 10:41 PM, Catalina Aguilar Hurtado
<catagui at gmail.com> wrote:
> Hi all,
>
> I have a RNAseq data to analyse were I have a control and a one treatment
> for different individuals. I need to block the effects of the individual,
> but I am having several troubles to get the data that I need. I am using
> voom because my data is very heterogeneous and voom seams to do a good job
> normalising my reads.
>
> I am having the following issues:
>
>    1.
>
>    I want to get the differentially expressed genes (DEGs) of my treatment
>    not of my control. I don't understand after the eBayes analysis why I get
>    the coefficients for both. I have tried a > makeContrasts (TreatvsCont=
>    c2-co, levels = design) to subtract the control effect but then I get 0
>    DEGs.
>    2.
>
>    I am not sure when to include the 0 (null model) in the model formula, I
>    have read examples for both types of models.
>
> This are my targets, with my column names of my counts, individual and
> condition
>
>>targets
>
> Individual condition
>
> A1 1 co
>
> A2 3 co
>
> A4 4 co
>
> A5 5 co
>
> E1 1 c2
>
> E2 2 c2
>
> E3 3 c2
>
> E4 4 c2
>
> E5 5 c2
>
> This is the code I have been trying:
>
>>co2=as.matrix(read.table("2014_04_02_1h_PB.csv",header=T, sep=",",
> row.names=1))
>
>>nf = calcNormFactors (co2)
>
>>targets= read.table ("targets.csv", header = T, sep=",",row.names=1)
>
>>treat <- factor (targets$condition, levels= c("co", "c2"))
>
>>design <- model.matrix(~0+treat)
>
>>colnames (design) <- levels (treat)
>
>>y <- voom(co2,design,lib.size=colSums(co2)*nf)
>
>>corfit <- duplicateCorrelation(y,design,block=targets$Individual)
>
>>fit <-
> lmFit(y,design,block=targets$Individual,correlation=corfit$consensus)
>
>>fit2<- eBayes (fit)
>
>>results_trt <- topTable (fit2, coef="c2", n=nrow (y), sort.by="none")
>
> >From which gives me 18,000 genes with adj.P.Val < 0.01 out of 22,000 genes
> that I have in total. Which makes no sense..

This is because you defined your model matrix to have no intercept
term (that's what the 0 in `~ 0 + treat` is doing).

You will have to test for a particular contrast (not just a coeficient
in your design) in order to get what you are after. Sections 9.7 (
Multi-level Experiments) and 16.3 (
Comparing Mammary Progenitor Cell Populations with Illumina BeadChips)
in the lima user's guide may be the most useful for you to follow
along at this point:

http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf

(look for the `makeContrasts` call)

After you call `fit <- lmFit( ... )` in your code above you might do
something like:

R> cm <- makeContrasts(co2Vsco=treatco2 - treatco0, levels=design)
R> fit2 <- eBayes(contrasts.fit(fit, cm))
R> res <- topTable(fit2, coef='co2Vsco')

Note that the `treatco2` and `treatco` are only correct if these are
the column names of your design matrix -- substitute with the
appropriate names for your example, if necessary.

> Thanks in advance for the help.

Noodle on that a bit and if you still have questions, please do post a
follow up question on the bioconductor list.

Btw, to help make your question more interpretable, since we don't
have your targets file, I think it would be easier for us if you
copy/paste the output of `dput(targets)` and `dput(design)` after you
create those object in a follow up email if it's necessary to write
one.

HTH,
-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech




More information about the R-help mailing list