[BioC] estimation of size factors in DESeq2 analysis
Michael Love
michaelisaiahlove at gmail.com
Wed Jul 2 15:11:35 CEST 2014
hi Assa,
I started to answer you on Biostars; I think our messages crossed:
https://www.biostars.org/p/105192/
On Wed, Jul 2, 2014 at 8:37 AM, Assa Yeroslaviz <frymor at gmail.com> wrote:
> Hi all,
>
> in relation to my mail from January this year, I followed Simon's advice to
> do my analyses in DESeq2 instead of DESeq.
>
> I am working on an RNA-Seq data set from C. elegans. I mapped the reads with
> tophat2 against the Ensembl genome build WBcel215 and counted them with
> featureCounts (both with default parameters).
>
> I have two conditions, control and a knock-out, each with three replicates. Now
> I am trying to find differentially regulated genes between the two
> conditions using DESeq2.
>
> This is the script I am using to read my raw count table into DESeq2:
>
> featureCountTable <- read.table("featureCountTable_RawCounts.txt",
> sep="\t", quote="")
>
> colData <- data.frame(row.names=names(featureCountTable), condition =
> c(rep("wt",3), rep("cpb3", 3)))
>
> cds <- DESeqDataSetFromMatrix (
> countData = featureCountTable,
> colData = colData,
> design = ~ condition
> )
>
> fit = DESeq(cds)
> res = results(fit)
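
A side note on the colData in the script, sketched in base R: R orders factor levels alphabetically, so "cpb3" sorts before "wt" and becomes the first (reference) level unless the levels are set explicitly. The variable name `condition` mirrors the script; treating wt as the intended reference is an assumption about the comparison wanted here.

```r
# Same condition vector as in the quoted script, stored as a factor.
condition <- factor(c(rep("wt", 3), rep("cpb3", 3)))

# Levels are sorted alphabetically by default, so "cpb3" comes first.
default_levels <- levels(condition)

# Assumption: wt is the control, so make it the reference (first) level.
condition <- relevel(condition, ref = "wt")
levels(condition)
```

With wt as the first level, fold changes would be reported as cpb3 over wt rather than the other way round.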
>
> But I am getting the same problem with DESeq2 as I got with DESeq.
> When I run the DESeq command I get these warnings:
> Warning messages:
> 1: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
> 2: step size truncated due to divergence
>
> So again I have tried to change the fitType.
> fit = DESeq(cds, fitType="local")
>
> This then came back without any warnings.
> The two dispersion plots can be found here
> <http://s23.postimg.org/pvopnmtxj/DESesq2_local.png> (local fit) and here
> <http://s23.postimg.org/uk4pitj47/DESesq2_parametric.png>
> (default/parametric fit). The red line goes through the point cloud in both
> cases (which is how Simon defined a good fit in our last exchange; I wish it
> had been so easy :-) ).
> In the local fit there are more outliers and the right end of the curve
> goes up again. I am not sure whether this is a good thing.
>
> So, my question is - which of the two options is better?
> I understand that in general the parametric (default) option is better,
> but here it gives me a warning, which suggests that something in the fit
> calculation went wrong.
>
> How should I interpret these plots?
>
> Thanks for the help
> Assa
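
One rough way to put the two fit types side by side is sketched below. It is only an illustration under stated assumptions: the counts are simulated with `makeExampleDESeqDataSet` rather than taken from the real table, `med_abs_resid` is a hypothetical helper (not a DESeq2 function), and the median absolute log residual is just one possible summary of how closely the fitted trend follows the gene-wise estimates.

```r
library(DESeq2)

# Simulated counts stand in for the real table (an assumption for illustration).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
dds <- estimateSizeFactors(dds)

# Fit the dispersion trend both ways on the same data.
dds_par <- estimateDispersions(dds, fitType = "parametric")
dds_loc <- estimateDispersions(dds, fitType = "local")

# Hypothetical helper: median absolute difference, on the log scale,
# between gene-wise dispersion estimates and the fitted trend.
med_abs_resid <- function(d) {
  gene <- mcols(d)$dispGeneEst
  fit  <- mcols(d)$dispFit
  # Skip genes without an estimate and those sitting at the lower bound.
  ok   <- !is.na(gene) & !is.na(fit) & gene > 1e-7
  median(abs(log(gene[ok]) - log(fit[ok])))
}

resid_par <- med_abs_resid(dds_par)
resid_loc <- med_abs_resid(dds_loc)

# Dispersion plots for a visual comparison of the two trends.
par(mfrow = c(1, 2))
plotDispEsts(dds_par, main = "parametric")
plotDispEsts(dds_loc, main = "local")
```

A smaller residual suggests the trend follows the bulk of the gene-wise estimates more closely, but the plots themselves should still be inspected.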