[BioC] DESeq2 - paired sample, interaction and fold change
samuelcollombet at gmail.com
Thu Jun 5 10:35:44 CEST 2014
On 05/06/2014 01:49, Michael Love wrote:
> Maybe you left this out because it's a toy example, but we recommend
> you use dds$treatment=factor(c("A","Ct","A","Ct","A","Ct","A","Ct"),
> levels=c("Ct","A")), so that the comparisons will be treatment /
> control. Also with WT / tumor.
Thanks for the reminder, I did not understood it, now I got why
sometimes I got the inverse of what I wanted...
>> If I do:
>>> res <- results(dds, name="conditionWT.treatmentCt")
>> I will get the gene for which there is a specific effect of treatment in the tumor, am I right?
> yes (except for a sign change, see note above on factor levels).
>> I understood that DESeq2 does make test taking into consideration paired samples, if the pairing information is put as a factor in the design formula (I understood it from http://seqanswers.com/forums/archive/index.php/t-34614.html, post by simon anders, 10-21-2013, 12:28 AM).
> yes, by including the patient variable in the design, differences in
> counts which are due to patient differences are accounted for.
I am still not sure I get it, certainly because I am not familiar with
expanded design matrices...
the contrast "conditionWT.treatmentCt" will define the vector of
contrast (named c in the doc/paper) that, multiplied by the vector of
GLM coefficient Beta, will give the vector of coefficient that will be
used to calculate the logarithmic posterior, right?
Is it possible to get the contrast vector / pair of contrast vector that
correspond to a resultName?
>> Is it possible to get the individual fold change for each patients?
> Can you be more specific on which fold change? Are you interested in the tumor-treatment interaction term for each patient?
> How many replicates do you have for each combination of patient x condition x treatment?
> If not many, I'd recommend sticking to the general effect above.
We suspect that not all sample respond to the treatment, and so we would
like to know if the 'non-significance' of some genes is due to global
non-response, or 1 sample that is not responding at all (or if they are
all responding differently, so the variability is too high to be
I can get this kind of information by getting the rlogTransformation of
the counts and comparing the different samples, but these will not be
the exact values used for the test... I would like to be able to extract
these last ones.
More information about the Bioconductor