# [BioC] DESeq2 16S copy number correction

Michael Love michaelisaiahlove at gmail.com
Fri Apr 25 01:55:25 CEST 2014

```
Hi Manoeli,

I think I follow your question, and I've been meaning to put in a
function to help in this case, but I didn't make it in time for the
latest release.

Below is some code for a toy example; tell me if this resembles your problem.

Suppose we want to estimate the size factors "sf". Here are the true values:

> sf <- c(.5,1,1,2)

And additionally, we have a matrix of factors which will contribute to
the counts, so I am thinking this is analogous to your copy number
information. The matrix is OTU x samples.

> m <- matrix(c(1,10,1,10,1,rep(1,3*5)),ncol=4)

Here we are encoding a copy number of x10 for the 2nd and 4th OTUs in
the first sample; every other entry is 1.

> m
     [,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]   10    1    1    1
[3,]    1    1    1    1
[4,]   10    1    1    1
[5,]    1    1    1    1

I generate counts using these size factors and the matrix m:

> (k <- matrix(rpois(20,100*rep(sf,each=5)*m),ncol=4))
     [,1] [,2] [,3] [,4]
[1,]   45  106  112  221
[2,]  478  103   91  199
[3,]   40  116   89  190
[4,]  497   81  102  183
[5,]   55  112   79  192

We recover estimates of the size factors from the counts after
dividing out m:

> (sf.hat <- estimateSizeFactorsForMatrix(k/m))
[1] 0.4919127 1.0599792 0.9456373 2.0187763

Then we can build a matrix of normalization factors:

> (nf <- rep(sf.hat, each=5) * m)
          [,1]     [,2]      [,3]     [,4]
[1,] 0.4919127 1.059979 0.9456373 2.018776
[2,] 4.9191266 1.059979 0.9456373 2.018776
[3,] 0.4919127 1.059979 0.9456373 2.018776
[4,] 4.9191266 1.059979 0.9456373 2.018776
[5,] 0.4919127 1.059979 0.9456373 2.018776

Then the normalized counts are k divided by the normalization factors:

> k / nf
          [,1]      [,2]      [,3]      [,4]
[1,]  91.47965 100.00197 118.43864 109.47226
[2,]  97.17172  97.17172  96.23140  98.57457
[3,]  81.31525 109.43611  94.11642  94.11642
[4,] 101.03420  76.41660 107.86376  90.64897
[5,] 111.80847 105.66245  83.54154  95.10712

This is what you would get from:

normalizationFactors(dds) <- nf
counts(dds, normalized=TRUE)

-Mike

On Thu, Apr 24, 2014 at 11:13 AM, Manoeli Lupatini <mlupatini at gmail.com> wrote:
> Hi,
>
>
> I have counts of DNA for 16S with different library sizes and want to use
> DESeq2 to normalize the counts. However, I used Picrust to correct the 16S
> copy number for OTUs, and the numbers generated by this correction are not
> integers (but decimals). Can I use DESeq2 to normalize my count data
> (using size factors) obtained by this 16S copy number correction, considering
> that DESeq2 was developed for counts and not for counts corrected by 16S
> copy number?
>
>
> Thanks,
>
>
> Manoeli
>
> --
>
> Manoeli Lupatini
> PhD candidate
> Netherlands Institute of Ecology (NIOO/KNAW)
> Wageningen, The Netherlands

```
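
The steps in the reply above can be collected into one script. The following is a minimal sketch in base R, so it runs without DESeq2 installed. The function `median_ratio_size_factors` is a hypothetical, hand-written median-of-ratios estimator standing in for `estimateSizeFactorsForMatrix`; it is meant to illustrate the idea and is not DESeq2's source code. The seed is also an assumption, added only so the run is reproducible, so the numbers will differ from those in the message.

```r
# Hypothetical stand-in for estimateSizeFactorsForMatrix (median of ratios).
# Rows are OTUs, columns are samples.
median_ratio_size_factors <- function(counts) {
  log_geo_means <- rowMeans(log(counts))   # per-OTU log geometric mean
  usable <- is.finite(log_geo_means)       # skip OTUs with a zero in any sample
  apply(counts, 2, function(cnts) {
    exp(median((log(cnts) - log_geo_means)[usable & cnts > 0]))
  })
}

set.seed(1)  # assumption: not in the original message, for reproducibility only

# True size factors and the OTU x sample copy number matrix from the toy example
sf <- c(0.5, 1, 1, 2)
m  <- matrix(c(1, 10, 1, 10, 1, rep(1, 3 * 5)), ncol = 4)

# Simulated counts: Poisson with mean 100 * size factor * copy number
k <- matrix(rpois(20, 100 * rep(sf, each = 5) * m), ncol = 4)

# Estimate size factors on the copy-number-corrected (non-integer) matrix
sf.hat <- median_ratio_size_factors(k / m)

# Normalization factors: column j of m multiplied by sf.hat[j]
nf <- sweep(m, 2, sf.hat, "*")

# Normalized counts: raw counts divided by the normalization factors
norm.counts <- k / nf
```

Here `nf` is the matrix that would be assigned with `normalizationFactors(dds) <- nf`, and `norm.counts` corresponds to `counts(dds, normalized=TRUE)`, assuming `dds` holds the raw counts `k`. `sweep` is used instead of `rep(sf.hat, each=5) * m` only to make the column-wise multiplication explicit; both give the same matrix.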