[BioC] DEXSeq: Problems trying to reproduce the example of the vignette

Wolfgang Huber whuber at embl.de
Sun Jun 16 00:57:25 CEST 2013


Dear Simone

thank you for your feedback! Your problem with 'plotDispEsts' is an oversight of ours in the last release of the DEXSeq package: the function 'plotDispEsts' in the DEXSeq vignette is defined 'on the fly' in the global workspace by the vignette (see http://bioconductor.org/packages/release/bioc/vignettes/DEXSeq/inst/doc/DEXSeq.R chunk 10). What happens in your case is that you probably didn't run this, and picked up the DESeq::plotDispEsts function instead, which throws the error. The sane solution, of course, would be for us to define this as a generic function (e.g. in BiocGenerics) and dispatch on argument type. This will be the case in the next release. Till then, as a workaround, please run the R code in the vignette from the .R file provided with the package, rather than copy-pasting it from the PDF.

I am not sure I understand your problem with 'plotMA'; what do you get from 
  table(res$padjust, useNA="always")
Perhaps it has to do with NA in res$padjust, which would lead to some points having NA colours (and thus not being shown) only when you use below-mentioned ifelse-expression.

As for your third question, isn't this solved by just adding another covariate for 'patient' which will absorb the patient effects?

Hope this helps, best wishes
	Wolfgang


On Jun 12, 2013, at 3:30 pm, Simone <enomis.bioc at gmail.com> wrote:

> Hi!
> 
> As the subject says, I've got some problems with DEXSeq. For the
> moment, I am just trying to reproduce the example of the Vignette. The
> first error that occurrs is when I try to plot the per-gene disperion
> estimates using plotDispEsts:
> 
>> library("DEXSeq")
>> library("pasilla")
>> library("parallel")
>> data("pasillaExons", package="pasilla")
> 
>> pasillaExons <- estimateSizeFactors(pasillaExons)
>> pasillaExons <- estimateDispersions(pasillaExons, nCores=3)
> Estimating Cox-Reid exon dispersion estimates using 3 cores. (Progress
> report: one dot per 100 genes)
>> pasillaExons <- fitDispersionFunction(pasillaExons)
> 
>> plotDispEsts(pasillaExons)
> Error: is(cds, "CountDataSet") is not TRUE
>> traceback()
> 4: stop(sprintf(ngettext(length(r), "%s is not TRUE", "%s are not all TRUE"),
>       ch), call. = FALSE, domain = NA)
> 3: stopifnot(is(cds, "CountDataSet"))
> 2: fitInfo(cds, name = name)
> 1: plotDispEsts(pasillaExons)
> 
> What am I doing wrong here? In general I work with the vignette which
> comes with the package (last revision 2013-02-27), but I found a piece
> of code in an older version which seems to work however:
> 
>> meanvalues <- rowMeans(counts(pasillaExons))
>> plot(meanvalues, fData(pasillaExons)$dispBeforeSharing, log = "xy", main = "mean vs CR dispersion")
>> x <- 0.01:max(meanvalues)
>> y <- pasillaExons at dispFitCoefs[1] + pasillaExons at dispFitCoefs[2]/x
>> lines(x, y, col = "red")
> 
> The second problem I have got is the MA plot after testing for
> differential exon usage:
> 
>> pasillaExons <- testForDEU(pasillaExons, nCores=3)
> Testing for differential exon usage using 3 cores. (Progress report:
> one dot per 100 genes)
>> pasillaExons <- estimatelog2FoldChanges(pasillaExons)
>> res <- DEUresultTable(pasillaExons)
>> plotMA(with(res, data.frame(baseMean=meanBase, log2FoldChange=log2fold, padj=padjust)), ylim=c(-4,4), cex=0.8)
> 
> This works fine. But I wanted to change the threshold for the color
> highlighting to "padjust < 0.05" (although this wouldn't change
> anything in this case, I just wanted to try). So I did the following:
> 
>> plotMA(with(res, data.frame(baseMean=meanBase, log2FoldChange=log2fold, padj=padjust)), ylim=c(-4,4), cex=0.8, col=ifelse(res$padj >= 0.05, "gray32", "red3"))
> 
> But whenever I use the col parameter with whatever value (also with
> simply one color), the output is different and more data points than
> before get displayed. I do not really understand this effect.
> 
> I worked with R 2.15 and the newest version of DEXSeq but updated
> everything yesterday (R and all packages, see sessionInfo below) to
> see if this changes something, but it didn't. Of course I can just
> copy the function from the source and define my own one changing the
> threshold, but I'd prefer to understand what's going on here.
> 
> An another small question regarding this plot: Reading the vignette I
> could not find out what the triangles mean which sometimes appear. Do
> they indicate outliers (lying outside the plotting area), or is this
> something completely different?
> 
> And my last question: I read in
> http://seqanswers.com/forums/showthread.php?t=13299 that it should be
> possible with DEXSeq to analyze paired samples as well. In my case I
> have for every individual patient two (or more) different celltypes
> which I would like to analyze, so I would have to use a paired test
> for differential exon usage. In the thread mentioned it was said that
> there is more information about how to do this in the vignette
> available, but I cannot find it. How would I have to add the patient
> information to account for paired samples in DEXSeq?
> 
> Sorry for asking that many (pretty ignorant) questions, it's the first
> time I am trying to use a package dealing with RNAseq-data, and I hope
> someone can help.
> 
> Best whishes,
> Simone
> 
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-pc-linux-gnu (64-bit)
> 
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> LC_TIME=en_US.UTF-8
> [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
> LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C                 LC_NAME=C
> LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8
> LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets
> methods   base
> 
> other attached packages:
> [1] pasilla_0.2.16     DESeq_1.12.0       lattice_0.20-15
> locfit_1.5-9.1     DEXSeq_1.6.0
> [6] Biobase_2.20.0     BiocGenerics_0.6.0
> 
> loaded via a namespace (and not attached):
> [1] annotate_1.38.0      AnnotationDbi_1.22.6 biomaRt_2.16.0
> Biostrings_2.28.0    bitops_1.0-5
> [6] DBI_0.2-7            genefilter_1.42.0    geneplotter_1.38.0
> GenomicRanges_1.12.4 grid_3.0.1
> [11] hwriter_1.3          IRanges_1.18.1       RColorBrewer_1.0-5
> RCurl_1.95-4.1       Rsamtools_1.12.3
> [16] RSQLite_0.11.4       splines_3.0.1        statmod_1.4.17
> stats4_3.0.1         stringr_0.6.2
> [21] survival_2.37-4      tools_3.0.1          XML_3.96-1.1
> xtable_1.7-1         zlibbioc_1.6.0
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list