[BioC] queries re Voom + Limma for RNA-seq data

Gordon K Smyth smyth at wehi.EDU.AU
Wed Apr 2 01:37:20 CEST 2014


Hi Guido,

> Date: Mon, 31 Mar 2014 19:20:34 +0000
> From: "Hooiveld, Guido" <guido.hooiveld at wur.nl>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Subject: [BioC] queries re Voom + Limma for RNA-seq data
>
> Hello Gordon and all,

> I am analyzing a RNA-seq experiment, and since I have quite some 
> experience analyzing microarray data using linear models, I decided to 
> go for the 'Voom - Limma' workflow. However, I do have some (practical) 
> queries, and would appreciate feedback on these:

> - when transforming the counts data using voom(), is an addition 
> normalization method recommended? E.g. cyclic loess? 
> (normalize.method="cyclicloess"). Note that before applying Voom I 
> already processed the count data with calcNormFactors().

No.  If you've already used calcNormFactors(), then no additional 
normalization is recommended.  voom will use the values computed by 
calcNormFactors and put into the DGEList object.

> - when applying the eBayes moderated statistical tests, is it still OK 
> to use "trend=TRUE"? fit2 <- eBayes(fit,trend=TRUE).

No.

> I am asking because voom() also estimates the mean-variance relationship 
> in the data, and I don't want to 'over-correct'. BTW, in the plotSA() 
> graphs I noticed only a minor effect of trend=TRUE.

Exactly. voom has already accounted for the trend.

Best wishes
Gordon

> Thanks,
> Guido
>
>
> library(limma)
> samples <- read.delim("samples_descriptions.txt")
> counts <- read.delim("count_table_geneLevel_1703.txt", row.names=1)
>
> isexpr <- rowSums(cpm(counts)>1) >= 3
> counts <- counts[isexpr,]
> #check: > dim(counts)  [1] 10658    18; (62.1% of genes are retained)
>
> d  <- DGEList(counts = counts, group = samples$Condition)
> d  <- calcNormFactors(d)
>
> treatment <- as.factor(samples$Condition)
> design <- model.matrix(~0+diet)
>
> v <- voom(d,design,plot=TRUE)
>
> # defined contrast matrix, then:
> fit <- lmFit(v,design)
> fit1 <- contrasts.fit(fit,cont.matrix)
>
> fit2 <- eBayes(fit1,trend=TRUE)
> plotSA(fit2)
>
> topTable(fit2)
>
>> sessionInfo()
> R Under development (unstable) (2013-11-19 r64265)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] splines   parallel  stats     graphics  grDevices utils     datasets
> [8] methods   base
>
> other attached packages:
> [1] edgeR_3.5.28       multtest_2.19.2    Biobase_2.23.6     BiocGenerics_0.9.3
> [5] limma_3.19.28
>
> loaded via a namespace (and not attached):
> [1] MASS_7.3-29     stats4_3.1.0    survival_2.37-4 tools_3.1.0
>>
>
> ---------------------------------------------------------
> Guido Hooiveld, PhD
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> Wageningen University
> Biotechnion, Bomenweg 2
> NL-6703 HD Wageningen
> the Netherlands
> tel: (+)31 317 485788
> fax: (+)31 317 483342
> email:      guido.hooiveld at wur.nl
> internet:   http://nutrigene.4t.com
> http://scholar.google.com/citations?user=qFHaMnoAAAAJ
> http://www.researcherid.com/rid/F-4912-2010

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list