[BioC] DESeqs two-factor two-level, interaction is interested

Shucong Li lishucongnn at gmail.com
Thu Jun 26 22:32:49 CEST 2014


Hello Michael, 

I tried to use the latest version of DESeq2 (1.5.20) and phyloseq (1.9.9) to analyze an illumina sequencing data set.  However, it didn't work and I got the error:

> library(DESeq2)

> phyloseq.obj <- seq_all
> dds <- phyloseq_to_deseq2(phyloseq.obj,design= ~  Treatment*Day)
converting counts to integer mode
> dds <- DESeq(dds, test = "Wald", fitType = "parametric",betaPrior=FALSE)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) : 
  subscript is a NSBS object that is incompatible
  with the current subsetting operation


I then tried use DESeq2(1.4.5) and phyloseq (1.8.2) to run the same codes, it worked without any issue.  I also tried the combination of  DESeq2 (1.4.5) +  phyloseq (1.9.9), DESeq2 (1.5.20) and phyloseq (1.8.2), they didn't work either and I go the same message  (Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) ).



Here is the output of sessionInfo()

R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
 [1] parallel  grid      splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.5.20             ecodist_1.2.9             Biostrings_2.32.0         doParallel_1.0.8          foreach_1.4.2            
 [6] iterators_1.0.7           metagenomeSeq_1.6.0       gplots_2.14.0             limma_3.20.1              Biobase_2.24.0           
[11] GenomicRanges_1.16.3      GenomeInfoDb_1.0.2        RcppArmadillo_0.4.300.8.0 Rcpp_0.11.2               XVector_0.4.0            
[16] IRanges_1.22.6            BiocGenerics_0.11.2       locfit_1.5-9.1            phangorn_1.99-7           genefilter_1.46.1        
[21] adephylo_1.1-6            scatterplot3d_0.3-35      analogue_0.12-0           rgl_0.93.996              princurve_1.1-12         
[26] labdsv_1.6-1              mgcv_1.7-29               indicspecies_1.7.1        biom_0.3.13               ggplot2_1.0.0            
[31] reshape2_1.4              plyr_1.8.1                phyloseq_1.9.9            pamr_1.54.1               cluster_1.15.2           
[36] survival_2.37-7           vegan_2.0-10              lattice_0.20-29           permute_0.8-3             RColorBrewer_1.0-5       
[41] matrixStats_0.10.0        MASS_7.3-33               ape_3.1-2                 ade4_1.6-2                nlme_3.1-117             

loaded via a namespace (and not attached):
 [1] adegenet_1.4-2       annotate_1.42.0      AnnotationDbi_1.26.0 bitops_1.0-6         brglm_0.5-9          caTools_1.17         codetools_0.2-8     
 [8] colorspace_1.2-4     data.table_1.9.2     DBI_0.2-7            digest_0.6.4         fastmatch_1.0-4      gdata_2.13.3         geneplotter_1.42.0  
[15] gtable_0.1.2         gtools_3.4.1         htmltools_0.2.4      httpuv_1.3.0         igraph_0.7.0         KernSmooth_2.23-12   Matrix_1.1-4        
[22] multtest_2.20.0      munsell_0.4.2        phylobase_0.6.8      proto_0.3-10         R.methodsS3_1.6.1    RJSONIO_1.2-0.2      RSQLite_0.11.4      
[29] S4Vectors_0.0.8      scales_0.2.4         shiny_0.10.0         stats4_3.1.0         stringr_0.6.2        tools_3.1.0          XML_3.98-1.1        
[36] xtable_1.7-3         zlibbioc_1.10.0     


Shucong Li
lishucongnn at gmail.com



On May 17, 2014, at 11:32 PM, Shucong Li <lishucongnn at gmail.com> wrote:

> Hi Mike,
> 
> Thank you.  With your help, I finally figured it out, sort of :-)
> 
> Enjoy the rest of your weekend.
> 
> Shawn
> On May 17, 2014, at 11:07 AM, Michael Love <michaelisaiahlove at gmail.com> wrote:
> 
>> hi Shawn,
>> 
>> Let's keep all replies on the Bioconductor mailing list, so other
>> users can follow the thread.
>> 
>> 
>> On Sat, May 17, 2014 at 11:32 AM, Shucong Li <lishucongnn at gmail.com> wrote:
>>> Hello Mike,
>>> 
>>> Thank you for reply on weekend.
>>> 
>>> Sorry for not make a part of my question properly. What I really meant is to compare :
>>> 
>>> "Factor A level 1" vs "Factor A  level 2"  within   "Fact B level 1"
>>> "Factor A level 1" vs "Factor A  level 2"  within   "Fact B level 2"
>> 
>> I don't understand what you mean by these comparisons, specifically
>> the 'within' part.
>> 
>> There are four terms in this model. I will write in the log2 scale, so
>> terms are additive (this is multiplication on the scale of counts):
>> 
>> log2(counts) = intercept + treatmenttreat + dayB + dayB.treatmenttreat
>> 
>> the treatmenttreat term is the difference between treatment and
>> control for day A
>> 
>> the dayB term is the difference between day B over day A for the control group
>> 
>> the final term accounts for the case that the (day B and treatment)
>> group is not merely the sum of the previous two terms.
>> 
>> If you want to compare treatment vs control for day B, that term is
>> the treatmenttreat term plus the dayB.treatmenttreat term. This can be
>> pulled out easily with a numeric contrast (see the help for ?results
>> for more details).
>> 
>> results(dds, contrast=c(0,1,0,1))
>> 
>> where the numbers correspond to the order in resultsNames(dds)
>> 
>> I'd recommend looking over some introductory material to interaction
>> models (e.g. http://en.wikipedia.org/wiki/Interaction_(statistics) ).
>> 
>> Mike
>> 
>>> 
>>> Is it possible to do so?  I checked all vignettes and manual of DESeq2 and still could not figure this out.
>>> 
>>> Shawn
>>> 
>>> On May 17, 2014, at 9:58 AM, Michael Love <michaelisaiahlove at gmail.com> wrote:
>>> 
>>>> hi Shawn,
>>>> 
>>>> 
>>>> On Sat, May 17, 2014 at 2:00 AM, Shawn [guest] <guest at bioconductor.org> wrote:
>>>>> Hello Mike,
>>>>> 
>>>>> Please allow me ask a basic question.  What does 'log2FoldChange' in the results of  DESeq2 analysis really mean for the interaction of a two-factor two-level design?
>>>> 
>>>> The meaning is the same as for other linear models. The interaction
>>>> term for the generalized linear model tests if the effect of both day
>>>> and treatment is not simply multiplicative (or additive in the log2
>>>> space). If this term is significantly non-zero (the default test in
>>>> DESeq2), then being in the group: (day=B and treatment=treat) is not
>>>> simply the product of the day=B fold change and the treatment=treat
>>>> fold change.
>>>> 
>>>> 
>>>>> Is it possible to compare 'Factor A level 1' to ' Factor A level 2' or other similar comparison?
>>>> 
>>>> For example, the day B over A effect:
>>>> 
>>>> results(dds, contrast=c("day","B","A"))
>>>> 
>>>> or equivalently
>>>> 
>>>> results(dds, name="day_B_vs_A")
>>>> 
>>>> Check the help for ?results for more examples.
>>>> 
>>>> Mike
>>>> 
>>>>> 
>>>>> Here are the part of codes I used:
>>>>> 
>>>>> 
>>>>> dds <- phyloseq_to_deseq2(phyloseq.obj, design=~ Treatment*Day)
>>>>> 
>>>>> colData(dds)$Treatment<- factor(colData(dds)$Treatment,levels=c("Control","Treat"));
>>>>> colData(dds)$Day<- factor(colData(dds)$Day,levels=c("A","B"))
>>>>> 
>>>>> dds$Treatment<- relevel(dds$Treatment, "Control")
>>>>> dds$Day<- relevel(dds$Day, "A")
>>>>> 
>>>>> dds <- DESeq(dds, fitType="local",betaPrior=FALSE)
>>>>> 
>>>>> Shawn
>>>>> 
>>>>> 
>>>>> -- output of sessionInfo():
>>>>> 
>>>>> sessionInfo()
>>>>> R version 3.1.0 (2014-04-10)
>>>>> Platform: x86_64-apple-darwin13.1.0 (64-bit)
>>>>> 
>>>>> locale:
>>>>> [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>>>>> 
>>>>> attached base packages:
>>>>> [1] parallel  grid      splines   stats     graphics  grDevices utils     datasets  methods   base
>>>>> 
>>>>> other attached packages:
>>>>> [1] ecodist_1.2.9           Biostrings_2.32.0       doParallel_1.0.8        foreach_1.4.2
>>>>> [5] iterators_1.0.7         metagenomeSeq_1.6.0     gplots_2.13.0           limma_3.20.1
>>>>> [9] Biobase_2.24.0          DESeq2_1.4.5            GenomicRanges_1.16.3    GenomeInfoDb_1.0.2
>>>>> [13] RcppArmadillo_0.4.300.0 Rcpp_0.11.1             XVector_0.4.0           IRanges_1.22.6
>>>>> [17] BiocGenerics_0.10.0     locfit_1.5-9.1          phangorn_1.99-7         genefilter_1.46.1
>>>>> [21] adephylo_1.1-6          scatterplot3d_0.3-35    analogue_0.12-0         rgl_0.93.996
>>>>> [25] princurve_1.1-12        labdsv_1.6-1            mgcv_1.7-29             indicspecies_1.7.1
>>>>> [29] biom_0.3.13             ggplot2_0.9.3.1         plyr_1.8.1              phyloseq_1.9.2
>>>>> [33] pamr_1.54.1             cluster_1.15.2          survival_2.37-7         vegan_2.0-10
>>>>> [37] lattice_0.20-29         permute_0.8-3           RColorBrewer_1.0-5      matrixStats_0.8.14
>>>>> [41] MASS_7.3-33             ape_3.1-1               ade4_1.6-2              nlme_3.1-117
>>>>> 
>>>>> loaded via a namespace (and not attached):
>>>>> [1] adegenet_1.4-1       annotate_1.42.0      AnnotationDbi_1.26.0 bitops_1.0-6
>>>>> [5] brglm_0.5-9          caTools_1.17         codetools_0.2-8      colorspace_1.2-4
>>>>> [9] data.table_1.9.2     DBI_0.2-7            digest_0.6.4         fastmatch_1.0-4
>>>>> [13] gdata_2.13.3         geneplotter_1.42.0   gtable_0.1.2         gtools_3.4.0
>>>>> [17] httpuv_1.3.0         igraph_0.7.0         KernSmooth_2.23-12   Matrix_1.1-3
>>>>> [21] multtest_2.20.0      munsell_0.4.2        phylobase_0.6.8      proto_0.3-10
>>>>> [25] R.methodsS3_1.6.1    reshape2_1.4         RJSONIO_1.2-0.2      RSQLite_0.11.4
>>>>> [29] scales_0.2.4         shiny_0.9.1          stats4_3.1.0         stringr_0.6.2
>>>>> [33] tools_3.1.0          XML_3.98-1.1         xtable_1.7-3         zlibbioc_1.10.0
>>>>> 
>>>>> 
>>>>> --
>>>>> Sent via the guest posting facility at bioconductor.org.
>>> 
> 



More information about the Bioconductor mailing list