[BioC] Multifactor model design for DE analysis (DESeq2 & edgeR)

Tue Aug 19 10:01:00 CEST 2014

Hi all,

I am using DESeq2 and edgeR to perform DE analysis on paired samples from a dog cancer project.
Sorry if this question is redundant, but I could not find an existing one close enough to my case.

I have been designing models with 2 factors: condition (control / tumor) and patient ID (to match the paired samples). So far I have used the model '~sample_id + condition' (sample_id being the patient ID), but I would like to add a third factor, the breed.
Is it then correct to use '~sample_id + breed + condition' if my goal is to analyse DE between control and tumor samples while accounting for individual variability (the sample ID factor) and breed variability (the breed factor)?

Here is an example of a sample table I could have:

           Patient ID   Condition   Breed
Sample1    1            Control     Breed1
Sample2    2            Control     Breed2
Sample3    3            Control     Breed1
Sample4    4            Control     Breed2
Sample5    1            Tumor       Breed1
Sample6    2            Tumor       Breed2
Sample7    3            Tumor       Breed1
Sample8    4            Tumor       Breed2
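
To make the question about adding breed concrete, the design matrix implied by this table can be built by hand and its rank checked. The following is only a sketch in plain Python, independent of DESeq2 and edgeR: it assumes dummy (treatment) coding with the first level of each factor as the reference, and the helper names design_matrix and matrix_rank are made up for illustration. Since every patient belongs to exactly one breed, the breed column equals the sum of the patient 2 and patient 4 columns, so it adds a sixth column without raising the rank.

```python
from fractions import Fraction

# Sample table from the post: (patient, condition, breed) for Sample1..Sample8.
SAMPLES = [
    (1, "Control", "Breed1"), (2, "Control", "Breed2"),
    (3, "Control", "Breed1"), (4, "Control", "Breed2"),
    (1, "Tumor", "Breed1"), (2, "Tumor", "Breed2"),
    (3, "Tumor", "Breed1"), (4, "Tumor", "Breed2"),
]


def design_matrix(samples, use_breed):
    """Intercept plus dummy columns; the first level of each factor is the reference.

    Hypothetical helper: an assumption about the coding, not a DESeq2/edgeR call.
    """
    rows = []
    for patient, condition, breed in samples:
        row = [1]                                       # intercept
        row += [int(patient == p) for p in (2, 3, 4)]   # patient 1 is the reference
        if use_breed:
            row.append(int(breed == "Breed2"))          # Breed1 is the reference
        row.append(int(condition == "Tumor"))           # Control is the reference
        rows.append(row)
    return rows


def matrix_rank(rows):
    """Rank by exact Gaussian elimination on rational numbers."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


with_breed = design_matrix(SAMPLES, use_breed=True)
without_breed = design_matrix(SAMPLES, use_breed=False)
print(len(with_breed[0]), matrix_rank(with_breed))        # 6 columns, rank 5
print(len(without_breed[0]), matrix_rank(without_breed))  # 5 columns, rank 5
```

If this reasoning holds, the breed effect cannot be separated from the patient effect for this particular table, which is exactly the situation the question above is about.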

From what I understand, I do not need to specify a contrast in my case, and I should always put 'condition' last in the model because it is the factor I want to analyse.

Another question:
If I use the pairing information, I have no replicates, because I only have two samples (one control, one tumor) per patient. Is it better to use the pairing (and have no replicates) or to ignore it (and have replicates for the 'control' and 'tumor' samples)?
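
One way to frame the replicate question is to count residual degrees of freedom, that is, the number of samples minus the number of fitted coefficients. The sketch below does this arithmetic for the 8-sample, 4-patient layout in the example table. It assumes both designs are full rank and uses an intercept plus dummy coding; the variable names are illustrative only and nothing here calls DESeq2 or edgeR.

```python
n_samples = 8    # Sample1..Sample8 in the example table
n_patients = 4   # each patient contributes one control and one tumor sample

# Paired design (~patient + condition):
# intercept + (n_patients - 1) patient terms + 1 condition term.
paired_coefs = 1 + (n_patients - 1) + 1

# Unpaired design (~condition): intercept + 1 condition term.
unpaired_coefs = 1 + 1

paired_residual_df = n_samples - paired_coefs
unpaired_residual_df = n_samples - unpaired_coefs
print(paired_residual_df, unpaired_residual_df)  # 3 6
```

Under these assumptions the paired model still leaves 3 residual degrees of freedom, so "no replicates" in the strict sense of identical groups does not by itself mean there is nothing left from which to estimate variability.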

Cheers,
Mathieu Bahin