[BioC] DESeq2 basemean and log2foldchange

Steve Lianoglou lianoglou.steve at gene.com
Fri Aug 1 20:33:11 CEST 2014


Hi,

On Fri, Aug 1, 2014 at 11:20 AM, rob yang <nextgame at hotmail.com> wrote:
> Hello DESeq community,
>
> I wanted to ask how basemean and log2foldchange are calculated.
>
> 1) From mcols(mcols(dds_fit)), the basemean is described as "basemean over all rows". But I can't pin down which "all rows" this means.

I reckon it should really say "columns" instead of "rows", but they
use the "rowMeans" and "rowVars" functions, so ... perhaps some
mismatch in thinking.

Anyway, you got the right idea.

> For a single transcript, my manual calculation of the mean of all counts across all replicates and all conditions does not equal the basemean DESeq reports for that transcript.

You'd have to show us what your manual calculations are. Take a look
at the `DESeq2:::getBaseMeansAndVariances` function to see how they do
it. You will see that it's really just the rowMeans of all the counts
after normalization (so a mean over the raw counts won't match), i.e.:

baseMean = unname(rowMeans(counts(object, normalized = TRUE)))
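
To illustrate why a hand-computed mean can differ, here is a minimal
sketch in base R only (no DESeq2 needed). The count matrix is made up,
and the size factors are computed with a simplified median-of-ratios
step, which I am assuming mirrors what DESeq2 does by default; treat it
as an approximation of the idea, not the package's actual code.

```r
# Toy count matrix (made-up numbers): 3 genes x 4 samples.
raw <- matrix(c( 10,  20,  40,  80,
                  5,  10,  20,  40,
                100, 200, 400, 800),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Simplified median-of-ratios size factors (an assumption about the
# default normalization): for each sample, take the median ratio of its
# counts to the per-gene geometric mean, skipping genes with zero counts.
log_geo_mean <- rowMeans(log(raw))
size_factors <- apply(raw, 2, function(cnts) {
  keep <- is.finite(log_geo_mean) & cnts > 0
  exp(median(log(cnts[keep]) - log_geo_mean[keep]))
})

# Normalized counts: divide each sample (column) by its size factor.
normalized <- sweep(raw, 2, size_factors, "/")

raw_mean  <- rowMeans(raw)         # mean over raw counts
base_mean <- rowMeans(normalized)  # analogous to the reported baseMean

print(cbind(raw_mean, base_mean))
```

If your manual number matches the raw_mean column rather than
base_mean, the size factors are the likely explanation.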

> 2) log2foldchange seems to change with the contrasting variable. This is a little surprising, since I'd expect log2foldchange to be a static number when comparing two conditions, with the contrasting variables only dictating the test for significance, i.e., the p-value.

What version of DESeq2 are you using? If I recall correctly, there was
an issue in changing logFC estimates when factors were releveled, but
I think this issue has been fixed (or highly mitigated, at least) in
recent versions -- if you are running the current version of DESeq2, I
think this problem should be gone.
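
A quick way to report the version, as a small sketch: `packageVersion`
and `requireNamespace` are standard R functions, and the guard is only
there so the snippet also runs on a machine without DESeq2.

```r
# Check whether DESeq2 is available before asking for its version.
deseq2_installed <- requireNamespace("DESeq2", quietly = TRUE)

if (deseq2_installed) {
  print(packageVersion("DESeq2"))
} else {
  message("DESeq2 is not installed in this R library")
}

# The R version is useful to include in a report as well.
print(R.version.string)
```

Pasting the output of sessionInfo() into your reply works too, and
shows the versions of everything else that is loaded.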

HTH,
-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech


