[BioC] normalizing RNAseq with batch/block-level bulk DE

Thu Oct 25 21:35:48 CEST 2012

Dear Aaron

DESeq's variance stabilising transformation does not do normalisation. 

By "deal with large scale differences in mean 'baseline' expression across experimental blocks", do you mean that you are comparing different biological conditions between which you expect the expression levels of many genes to change? If so, the best approach is to work with a set of negative control genes: these can be either spike-ins or a category of genes that you know shouldn't change much. Then call 'estimateSizeFactors' only on the data for these genes, but apply the resulting size factors to all of the data (using the assignment function 'sizeFactors<-').

	Best wishes
	Wolfgang.

On Oct 25, 2012, at 5:43 PM, Aaron Mackey <amackey at virginia.edu> wrote:

> Is VST-normalization (a la DESeq) considered the right way to deal with
> large scale differences in mean "baseline" expression across experimental
> blocks?  Is there a normalization method that can take into account the
> design matrix (or at least the batch/block columns)?  I don't want to
> remove the batch/block effects, but TMM and friends all assume
> near-constant expression across the design, which is violated by our
> (nuisance) block-level differences in composition.  We see this when we
> compare edgeR TMM-normalized log(cpm) to qRT-PCR data; the
> TMM-normalization has smoothed out the block differences that the Ct values
> still exhibit (though cpm and Ct are still strongly correlated, there is a
> Ct "shift" for each different block that is not seen in the cpm).
> 
> Thanks in advance for any insights/thoughts on the issue,
> -Aaron