[BioC] DESeq2 16S copy number correction
Michael Love
michaelisaiahlove at gmail.com
Fri Apr 25 01:55:25 CEST 2014
Hi Manoeli,
I think I follow your question. I've been meaning to add a function to
help with this case, but it didn't make it in time for the latest
release.
Below is some code for a toy example; tell me if this resembles your problem.
Suppose we want to estimate the size factors "sf". Here are the true values:
> sf <- c(.5,1,1,2)
In addition, we have a matrix of factors that contribute to the
counts, which I think is analogous to your copy number information.
The matrix is OTUs x samples.
> m <- matrix(c(1,10,1,10,1,rep(1,3*5)),ncol=4)
Here we are encoding a copy number of 10 for the 2nd and 4th OTUs in
the first sample.
> m
     [,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]   10    1    1    1
[3,]    1    1    1    1
[4,]   10    1    1    1
[5,]    1    1    1    1
I generate counts using these size factors and the matrix m:
> (k <- matrix(rpois(20,100*rep(sf,each=5)*m),ncol=4))
     [,1] [,2] [,3] [,4]
[1,]   45  106  112  221
[2,]  478  103   91  199
[3,]   40  116   89  190
[4,]  497   81  102  183
[5,]   55  112   79  192
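To see what those counts should look like on average, the Poisson means can be written out explicitly. This is just a sketch; the name mu is introduced here for illustration only.

```r
# True size factors and the OTU x sample copy-number matrix from the toy example
sf <- c(.5, 1, 1, 2)
m <- matrix(c(1, 10, 1, 10, 1, rep(1, 3 * 5)), ncol = 4)

# Expected count per OTU and sample: base level 100 x size factor x copy number.
# rep(sf, each = 5) repeats each sample's size factor down its column,
# which lines up with R's column-major layout of m.
mu <- 100 * rep(sf, each = 5) * m
mu
```

The 2nd and 4th OTUs in the first sample have a mean of 500 rather than 50, which is the copy number effect that has to be divided out before estimating size factors.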
We recover the size factor estimates from the counts after dividing
out m:
> (sf.hat <- estimateSizeFactorsForMatrix(k/m))
[1] 0.4919127 1.0599792 0.9456373 2.0187763
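For intuition, here is a minimal base R sketch of the median-of-ratios calculation that, as far as I understand it, estimateSizeFactorsForMatrix performs with its default settings. The helper name median_of_ratios is hypothetical, and k is typed in from the simulated counts printed above.

```r
# Counts and copy-number matrix copied from the toy example (column-major)
k <- matrix(c( 45, 478,  40, 497,  55,
              106, 103, 116,  81, 112,
              112,  91,  89, 102,  79,
              221, 199, 190, 183, 192), ncol = 4)
m <- matrix(c(1, 10, 1, 10, 1, rep(1, 3 * 5)), ncol = 4)

# Hypothetical helper: median-of-ratios size factors in base R
median_of_ratios <- function(x) {
  # log geometric mean of each OTU across samples
  log_geo_means <- rowMeans(log(x))
  apply(x, 2, function(col) {
    # ignore OTUs with a zero in any sample (infinite log geometric mean)
    keep <- is.finite(log_geo_means) & col > 0
    # size factor = median ratio of this sample to the per-OTU geometric mean
    exp(median((log(col) - log_geo_means)[keep]))
  })
}

sf.hat <- median_of_ratios(k / m)
sf.hat
```

On these counts the sketch should agree with the sf.hat values printed above, up to rounding.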
Then we can build a matrix of normalization factors:
> (nf <- rep(sf.hat, each=5) * m)
          [,1]     [,2]      [,3]     [,4]
[1,] 0.4919127 1.059979 0.9456373 2.018776
[2,] 4.9191266 1.059979 0.9456373 2.018776
[3,] 0.4919127 1.059979 0.9456373 2.018776
[4,] 4.9191266 1.059979 0.9456373 2.018776
[5,] 0.4919127 1.059979 0.9456373 2.018776
Then the normalized counts are k divided by the normalization factors:
> k / nf
          [,1]      [,2]      [,3]      [,4]
[1,]  91.47965 100.00197 118.43864 109.47226
[2,]  97.17172  97.17172  96.23140  98.57457
[3,]  81.31525 109.43611  94.11642  94.11642
[4,] 101.03420  76.41660 107.86376  90.64897
[5,] 111.80847 105.66245  83.54154  95.10712
This is what you would get from:
normalizationFactors(dds) <- nf
counts(dds, normalized=TRUE)
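Putting the pieces together with an actual DESeqDataSet might look like the following sketch. It assumes DESeq2 is installed; the sample names, the condition column, and the design are placeholders made up for the toy data. Note that the raw integer counts go into the object, and the copy number information enters only through the normalization factors.

```r
# Toy inputs from the example above: integer counts and copy-number matrix
k <- matrix(as.integer(c( 45, 478,  40, 497,  55,
                         106, 103, 116,  81, 112,
                         112,  91,  89, 102,  79,
                         221, 199, 190, 183, 192)), ncol = 4)
m <- matrix(c(1, 10, 1, 10, 1, rep(1, 3 * 5)), ncol = 4)

# Only run the DESeq2 part when the package is available
if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq2))

  # Placeholder sample table and design, for illustration only
  coldata <- data.frame(condition = factor(c("A", "A", "B", "B")),
                        row.names = paste0("sample", 1:4))
  colnames(k) <- rownames(coldata)

  # The dataset holds the raw integer counts, not copy-number-corrected values
  dds <- DESeqDataSetFromMatrix(countData = k, colData = coldata,
                                design = ~ condition)

  # Size factors estimated on counts with the copy number divided out
  sf.hat <- estimateSizeFactorsForMatrix(k / m)

  # Per-OTU, per-sample normalization factors: size factor x copy number
  nf <- rep(sf.hat, each = nrow(k)) * m
  normalizationFactors(dds) <- nf

  norm_counts <- counts(dds, normalized = TRUE)
  print(norm_counts)
}
```

Once normalization factors are set on the object, my understanding is that downstream DESeq2 steps use them in place of the size factors.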
-Mike
On Thu, Apr 24, 2014 at 11:13 AM, Manoeli Lupatini <mlupatini at gmail.com> wrote:
> Hi,
>
>
> I have 16S DNA counts with different library sizes and want to use
> DESeq2 to normalize the counts. However, I used PICRUSt to correct the 16S
> copy number for OTUs, and the numbers generated by this correction are not
> integers (but decimals). Can I use DESeq2 to normalize my count data
> (using size factors) obtained by this 16S copy number correction, considering
> that DESeq2 was developed for counts and not for counts corrected by 16S
> copy number?
>
>
> Thanks,
>
>
> Manoeli
>
> --
>
> Manoeli Lupatini
> PhD candidate
> Netherlands Institute of Ecology (NIOO/KNAW)
> Wageningen, The Netherlands