[BioC] DEXSeq continous variable in model

Alex Gutteridge alexg at ruggedtextile.com
Tue Jul 29 11:00:39 CEST 2014


Hi,

I have a time course RNASeq experiment and I'd like to detect DEU 
between early and late stages. I am trying to use 'Age' as a continuous 
variable in my design, but I'm getting an error which I suspect is 
related to this e.g:

> summary(sample.data.ipsc[,1:6])
     Sample          CellType     Subtype       Donor         Age
  Length:28          iPSC:28   iPSC   :28   4555   :28   Min.   : 2.000
  Class :character             HBR    : 0   D4721  : 0   1st Qu.: 4.000
  Mode  :character             L2     : 0   D4749  : 0   Median : 8.000
                               L3     : 0   D6002  : 0   Mean   : 7.821
                               L4     : 0   D6005  : 0   3rd Qu.:10.500
                               L5     : 0   D6008  : 0   Max.   :14.000
                               (Other): 0   (Other): 0
  Passage
  42:3
  48:7
  49:5
  52:6
  56:7

> dxd = DEXSeqDataSetFromHTSeq(countFiles,
                                sampleData=sample.data.ipsc,
                                design= ~ sample + exon + Age:exon,
                                flattenedfile=flattenedFile)

> BPPARAM = MulticoreParam(workers=24)

> dxd = estimateSizeFactors(dxd)
> dxd = estimateDispersions(dxd, BPPARAM=BPPARAM)
> dxd = testForDEU(dxd, BPPARAM=BPPARAM)
> dxd = estimateExonFoldChanges(dxd, fitExpToVar="Age", BPPARAM=BPPARAM)
Error: 8346 errors; first error:
   Error in FUN(1:3[[1L]], ...): Non-factor in model frame

For more information, use bplasterror(). To resume calculation, re-call
   the function and set the argument 'BPRESUME' to TRUE or wrap the
   previous call in bpresume().

First traceback:
   31: estimateExonFoldChanges(dxd, fitExpToVar = "Age", BPPARAM = 
BPPARAM)
   30: bplapply(testablegenes, geteffects, BPPARAM = BPPARAM)
   29: bplapply(testablegenes, geteffects, BPPARAM = BPPARAM)
   28: mclapply(X = X, FUN = FUN, ..., mc.set.seed = BPPARAM$setSeed,
           mc.silent = !BPPARAM$verbose, mc.cores = bpworkers(BPPARAM),
           mc.cleanup = if (BPPARAM$cleanup) BPPARAM$cleanupSignal else 
FALSE)
   27: lapply(seq_len(cores), inner.do)
   26: lapply(seq_len(cores), inner.do)
   25: FUN(1:24[[1L]], ...)
   24: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
   23: parallel:::sendMaster(...)
   22: try(lapply(X = S, FUN = FUN, ...), silent = TRUE)
   21: tryCatch(expr, error = function(e)

Can DEXSeq accept a model with a continuous variable? Does it make sense 
to do so? (I do the same thing with DESeq2 to detected DE and it works 
fine). Is this error due to that? Note that all the other steps seem to 
run fine and I can get results (though I don't have many significant 
hits - not sure if this is related or not). If not what is best 
practice? Just split the data set into 'early' and 'late' samples and 
run that as a factor?

Alex Gutteridge

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
  [1] stringr_0.6.2
  [2] DEXSeq_1.10.8
  [3] BiocParallel_0.6.1
  [4] reshape_0.8.5
  [5] ggplot2_1.0.0
  [6] matrixStats_0.10.0
  [7] statmod_1.4.20
  [8] pcaMethods_1.54.0
  [9] Homo.sapiens_1.1.2
[10] TxDb.Hsapiens.UCSC.hg19.knownGene_2.14.0
[11] org.Hs.eg.db_2.14.0
[12] GO.db_2.14.0
[13] RSQLite_0.11.4
[14] DBI_0.2-7
[15] OrganismDbi_1.6.0
[16] GenomicFeatures_1.16.2
[17] AnnotationDbi_1.26.0
[18] Biobase_2.24.0
[19] limma_3.20.8
[20] DESeq2_1.4.5
[21] RcppArmadillo_0.4.320.0
[22] Rcpp_0.11.2
[23] GenomicRanges_1.16.3
[24] GenomeInfoDb_1.0.2
[25] IRanges_1.22.9
[26] BiocGenerics_0.10.0
[27] gplots_2.14.1
[28] RColorBrewer_1.0-5

loaded via a namespace (and not attached):
  [1] annotate_1.42.1         BatchJobs_1.3           BBmisc_1.7
  [4] biomaRt_2.20.0          Biostrings_2.32.1       bitops_1.0-6
  [7] brew_1.0-6              BSgenome_1.32.0         caTools_1.17
[10] checkmate_1.2           codetools_0.2-8         colorspace_1.2-4
[13] digest_0.6.4            fail_1.2                foreach_1.4.2
[16] gdata_2.13.3            genefilter_1.46.1       geneplotter_1.42.0
[19] GenomicAlignments_1.0.3 graph_1.42.0            grid_3.1.0
[22] gtable_0.1.2            gtools_3.4.1            hwriter_1.3
[25] iterators_1.0.7         KernSmooth_2.23-12      labeling_0.2
[28] lattice_0.20-29         locfit_1.5-9.1          MASS_7.3-33
[31] munsell_0.4.2           plyr_1.8.1              proto_0.3-10
[34] RBGL_1.40.0             RCurl_1.95-4.1          reshape2_1.4
[37] R.methodsS3_1.6.1       Rsamtools_1.16.1        rtracklayer_1.24.2
[40] scales_0.2.4            sendmailR_1.1-2         splines_3.1.0
[43] stats4_3.1.0            survival_2.37-7         tools_3.1.0
[46] XML_3.98-1.1            xtable_1.7-3            XVector_0.4.0
[49] zlibbioc_1.10.0



More information about the Bioconductor mailing list