[BioC] DEXSeq continous variable in model
Alex Gutteridge
alexg at ruggedtextile.com
Wed Jul 30 11:13:17 CEST 2014
On 29-07-2014 10:00, Alex Gutteridge wrote:
> Hi,
>
> I have a time course RNASeq experiment and I'd like to detect DEU
> between early and late stages. I am trying to use 'Age' as a
> continuous variable in my design, but I'm getting an error which I
> suspect is related to this e.g:
>
>> summary(sample.data.ipsc[,1:6])
> Sample CellType Subtype Donor Age
> Length:28 iPSC:28 iPSC :28 4555 :28 Min. : 2.000
> Class :character HBR : 0 D4721 : 0 1st Qu.: 4.000
> Mode :character L2 : 0 D4749 : 0 Median : 8.000
> L3 : 0 D6002 : 0 Mean : 7.821
> L4 : 0 D6005 : 0 3rd Qu.:10.500
> L5 : 0 D6008 : 0 Max. :14.000
> (Other): 0 (Other): 0
> Passage
> 42:3
> 48:7
> 49:5
> 52:6
> 56:7
>
>> dxd = DEXSeqDataSetFromHTSeq(countFiles,
> sampleData=sample.data.ipsc,
> design= ~ sample + exon + Age:exon,
> flattenedfile=flattenedFile)
>
>> BPPARAM = MulticoreParam(workers=24)
>
>> dxd = estimateSizeFactors(dxd)
>> dxd = estimateDispersions(dxd, BPPARAM=BPPARAM)
>> dxd = testForDEU(dxd, BPPARAM=BPPARAM)
>> dxd = estimateExonFoldChanges(dxd, fitExpToVar="Age", BPPARAM=BPPARAM)
> Error: 8346 errors; first error:
> Error in FUN(1:3[[1L]], ...): Non-factor in model frame
>
> For more information, use bplasterror(). To resume calculation, re-call
> the function and set the argument 'BPRESUME' to TRUE or wrap the
> previous call in bpresume().
>
> First traceback:
> 31: estimateExonFoldChanges(dxd, fitExpToVar = "Age", BPPARAM =
> BPPARAM)
> 30: bplapply(testablegenes, geteffects, BPPARAM = BPPARAM)
> 29: bplapply(testablegenes, geteffects, BPPARAM = BPPARAM)
> 28: mclapply(X = X, FUN = FUN, ..., mc.set.seed = BPPARAM$setSeed,
> mc.silent = !BPPARAM$verbose, mc.cores = bpworkers(BPPARAM),
> mc.cleanup = if (BPPARAM$cleanup) BPPARAM$cleanupSignal else
> FALSE)
> 27: lapply(seq_len(cores), inner.do)
> 26: lapply(seq_len(cores), inner.do)
> 25: FUN(1:24[[1L]], ...)
> 24: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
> 23: parallel:::sendMaster(...)
> 22: try(lapply(X = S, FUN = FUN, ...), silent = TRUE)
> 21: tryCatch(expr, error = function(e)
>
> Can DEXSeq accept a model with a continuous variable? Does it make
> sense to do so? (I do the same thing with DESeq2 to detected DE and it
> works fine). Is this error due to that? Note that all the other steps
> seem to run fine and I can get results (though I don't have many
> significant hits - not sure if this is related or not). If not what is
> best practice? Just split the data set into 'early' and 'late' samples
> and run that as a factor?
>
> Alex Gutteridge
To sort of answer my own question: cutting the age vector into two bins
(early <8 weeks and late >8 weeks) and using that factor seems to work
and give many significant hits:
> sample.data.ipsc$AgeBin = cut(sample.data.ipsc$Age,2)
> dxd = DEXSeqDataSetFromHTSeq(countFiles,
sampleData=sample.data.ipsc,
design= ~ sample + exon + AgeBin:exon,
flattenedfile=flattenedFile)
> BPPARAM = MulticoreParam(workers=12)
> dxd = estimateSizeFactors(dxd)
> dxd = estimateDispersions(dxd, BPPARAM=BPPARAM)
> dxd = testForDEU(dxd, BPPARAM=BPPARAM)
Which makes me think that DEXSeq wasn't doing what I hoped with the
previous analysis (quite apart from the error). Is there any way to
rejig the design to cope with a continuous variable as opposed to a
factor? I *think* the underlying biological question makes sense: Are
there exons showing DEU that correlates with time?
Alex Gutteridge
More information about the Bioconductor
mailing list