[BioC] sva + voom + limma

Gordon K Smyth smyth at wehi.EDU.AU
Tue Sep 3 13:01:54 CEST 2013


> Date: Mon, 2 Sep 2013 19:35:07 +0200
> From: Meritxell Oliva <meritxellop at gmail.com>
> To: bioconductor at r-project.org
> Cc: Mario C?ceres <mcaceres at icrea.cat>
> Subject: [BioC] sva + voom + limma
>
> Dear Bioconductor list,
> Dear Jeff Leek & Gordon Smith,
> I want to use sva() to estimate potential surrogate variables of a RNASeq derived expression dataset, as a previous step to perform differential gene expression analysis with limma(), previously using voom() to transform RNASeq to microarray-like expression data.
> As far as I know, SVA was originally designed to deal with ( normally distibuted ) microarray expression data, but can also be used to work with RNSeq data. Please, correct me if I am wrong here!
> So, I first transform the raw counts into cpm-log2 values, by using edgeR function calcNormFactors() and voom(). I apply sva() on the transformed dataset to compute the surrogate variables. Then, I build the design matrix to create a linear model with my primary variable of interest (InvGeno, a quantitative discrete variable: 0,1,2 ) and the set of surrogate variables, and I finally apply voom()+lmFit()+eBayes() to obtain DE candidates:
> ###
> y <- calcNormFactors(rawCounts_epression_dataset_matrix);
> mod1 <- model.matrix(~InvGeno);
> mod0 <- model.matrix(~1,data=InvGeno);
> v <- voom(counts=y, design = mod1)
> sva.obj <- sva(v$E, mod1, mod0,method="irw",n.sv=10);

It's a pity that sva() can't use weights, but I can't suggest anything 
better.

> mod1 <- model.matrix(~InvGeno+sva.obj$sv);
> v <- voom(counts=y, design = mod1);
> fit.obj <- lmFit(v, design);

I think you mean design=mod1.

> fit.obj <- eBayes(fit,trend=TRUE);

trend=TRUE isn't needed.  voom() already handles the trend.

Gordon

> ###
> As voom needs to be fed by raw counts and performs the cpm+log2 steps internally, I am not sure about the properness of including in the linear model the sva-computed surrogate variables from cpm-log2 values, and the implications that this step may produce in the DE analysis.
> Could you suggest an appropriate strategy so as to achieve my purposes?
> Thanks a lot!!!
>
> Meritxell Oliva
> PhD student
> IBB (Biotechnology and Biomedicine Institute)
> Comparative and Functional Genomics group
> Campus Universitari - 08193 Bellaterra Cerdanyola del Vallès - Barcelona
>

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:5}}


More information about the Bioconductor mailing list