[BioC] Voom Normalization and negative numbers

Tue Apr 1 19:19:12 CEST 2014

Hi Michael,
As described in the help file (?voom) the $E component of the output object contains a "numeric matrix of normalized expression values on the log2 scale".

So negative values indicate low levels of (normalized) expression. 
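
To make the scale concrete, here is a small base-R sketch. The cpm values are made up for illustration and are not taken from your data; the point is only that anything below 1 count per million becomes negative once you take log2.

```r
# Illustrative cpm values (hypothetical, not from the poster's data)
cpm_values <- c(0.1, 0.5, 1, 10, 100)

# Move to the log2 scale, as in the $E component
log2_cpm <- log2(cpm_values)

# Values below 1 cpm map to negative numbers; exactly 1 cpm maps to 0
data.frame(cpm = cpm_values, log2_cpm = round(log2_cpm, 2))
```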

Your filtering step removes genes that have cpm > 10 in 14 or fewer samples (about half the samples), but a gene that passes the filter can still easily have a low level of expression in any particular sample.

Imagine, for a given gene, that half your samples have cpm > 10 and the other half have cpm = 0.1. You would expect to see the latter half with negative normalized expression levels.
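
A rough sketch of that scenario in base R, assuming 30 samples with a library size of 10 million reads each (both numbers are invented for the example). The transform used here, log2((count + 0.5) / (lib.size + 1) * 1e6), is the log-cpm calculation that voom applies as far as I know; normalization factors are ignored for simplicity.

```r
# Hypothetical gene: 15 samples at 20 cpm, 15 samples at 0.1 cpm
lib_size <- rep(1e7, 30)                  # assumed library sizes
counts   <- c(rep(200, 15), rep(1, 15))   # 200 reads = 20 cpm, 1 read = 0.1 cpm

# Plain counts per million
cpm_raw <- counts / lib_size * 1e6

# Same rule as the filter in the question: cpm > 10 in at least 15 samples
keep <- sum(cpm_raw > 10) >= 15

# voom-style log2-cpm; the 0.5 and 1 offsets avoid taking the log of zero
log_cpm <- log2((counts + 0.5) / (lib_size + 1) * 1e6)

keep                    # the gene passes the filter
range(log_cpm[1:15])    # high-expression half: positive values
range(log_cpm[16:30])   # low-expression half: negative values
```

So the gene is kept by the filter, yet half of its entries in the normalized matrix are below zero.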

Wade

________________________________________
From: Michael Breen [breenbioinformatics at gmail.com]
Sent: Tuesday, April 01, 2014 4:12 AM
To: bioconductor at r-project.org
Subject: [BioC] Voom Normalization and negative numbers

Hi all,

We are applying Voom normalization to RNA-Seq Counts with the following
code:

library(edgeR)
count <- read.delim("Counts.txt", check.names=FALSE, stringsAsFactors=FALSE)
targets <- read.delim("Targets.txt", check.names=FALSE,
stringsAsFactors=FALSE)

#filter
y <- DGEList(counts=count[,2:31], genes=count[,1:1])
keep <- rowSums(cpm(y)>10) >= 15
y <- y[keep,]
dim (y)

#norm
y <- calcNormFactors(y)

#voom
VST <- voom(y,design=NULL,plot=TRUE)
voom_matrix <- cbind(VST$genes, VST$E)
write.table (voom_matrix, "VOOM_Matrix.txt", sep="\t")

However, even after this filtering step, I still find negative
expression values in my voom-normalized matrix. Why is this?

Michael

--
M.S. Breen
PhD, Bioinformatics and Genomics
Clinical and Experimental Sciences
Univ. of Southampton
