[BioC] DESeq2 multiple factors nested design

Isobel [guest] guest at bioconductor.org
Thu Mar 27 18:41:10 CET 2014

Dear all,

I have data collected from an experiment looking at expression differences related to diet and species. I have samples from six species, and for each species I have five paired clones, each pair exposed to one of two diets. The clones are genetically identical but were raised independently (see below for data structure – first three species only).

species	diet	clone
sp1		A	c1
sp1		B	c1
sp1		A	c2
sp1		B	c2
sp1		A	c3
sp1		B	c3
sp1		A	c4
sp1		B	c4
sp1		A	c5
sp1		B	c5
sp2		A	c6
sp2		B	c6
sp2		A	c7
sp2		B	c7
sp2		A	c8
sp2		B	c8
sp2		A	c9
sp2		B	c9
sp2		A	c10
sp2		B	c10
sp3		A	c11
sp3		B	c11
sp3		A	c12
sp3		B	c12
sp3		A	c13
sp3		B	c13
sp3		A	c14
sp3		B	c14
sp3		A	c15
sp3		B	c15

I am interested in genes differentially expressed between diet A and diet B, in genes differentially expressed between species, and in particular I am interested in identifying genes where differential expression between diets varies in relation to species (i.e. the diet-species interaction).

My problem is in finding a way of accounting for clone within the model in DESeq. In an ideal world I would like to incorporate “clone” as a factor in the model, to account for differences between samples resulting from “clone”, so in the first instance I’d like to compare the following:

~ clone + species + diet + species:diet
~ clone + species + diet

However, I think because “clone” is nested within “species” so any given level of clone is only present within one biotype, simply putting clone into the design formula of DESeq2 does not work, any design including both clone and species, for example: 

dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData,design = ~ clone+species+diet)

returns the error “invalid class “DESeqDataSet” object: the model matrix is not full rank, i.e. one or more variables in the design formula are linear combinations of the others”.

Am I interpreting this error correctly? And if so, is there a way I could stipulate that clone is nested within species in DESeq?

Many thanks for your help.

 -- output of sessionInfo(): 

R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Biobase_2.22.0            DESeq2_1.2.10             RcppArmadillo_0. Rcpp_0.11.1              
[5] GenomicRanges_1.14.4      XVector_0.2.0             IRanges_1.20.7            BiocGenerics_0.8.0       

loaded via a namespace (and not attached):
 [1] annotate_1.40.1      AnnotationDbi_1.24.0 DBI_0.2-7            genefilter_1.44.0    grid_3.0.2          
 [6] lattice_0.20-27      locfit_1.5-9.1       RColorBrewer_1.0-5   RSQLite_0.11.4       splines_3.0.2       
[11] stats4_3.0.2         survival_2.37-7      XML_3.95-0.2         xtable_1.7-3        

Sent via the guest posting facility at bioconductor.org.

More information about the Bioconductor mailing list