[BioC] DESeq2 get coefficients for each variable in the design formula

Michael Love michaelisaiahlove at gmail.com
Tue Jul 1 20:57:41 CEST 2014


hi Quynh,

On Tue, Jul 1, 2014 at 2:38 PM, Tran, Nhu Quynh T <qtran1 at uthsc.edu> wrote:
> Hi Mike,
>
> What do you mean by "testing individual coefficients"?  I thought that when adding age in the design formula, the model will test the coefficient of GROUP in the presence of AGE, to check if age plays a confounding role in the changes in gene expression (if any) between the two groups.

Yes, when you have both terms in the design, the model fit by
DESeq() uses both terms, and the LFCs are different than the ones you
would get with ~ age or ~ group alone.

By "testing individual coefficients", I meant that, when you call
results() after fitting this model with both terms, and you specify
the group term, you are pulling out the group term and testing if it
is equal to 0.

>
> Quynh
>
> On Jul 1, 2014, at 1:18 PM, Michael Love wrote:
>
>> hi Quynh,
>>
>> The default pipeline is for testing individual coefficients, and the
>> code you have is fine, defining two sets based on the two tests, and
>> look at the union of these.
>>
>> It is also possible to perform a statistical test of both at once,
>> using a likelihood ratio test. There is a section in the vignette
>> explaining the LRT.
>>
>> This would look like (using dds to shorten your object name):
>>
>> dds <- DESeq(dds, test="LRT", reduced=~1)
>> res <- results(dds)
>>
>> You should then add the two log fold change columns from the results
>> tables you have above as columns to this new results table. (I'm still
>> working in the devel branch on making this part simpler, specifying
>> contrast LFCs to add to LRT results tables).
>>
>> Mike
>>
>> On Tue, Jul 1, 2014 at 2:03 PM, Tran, Nhu Quynh T <qtran1 at uthsc.edu> wrote:
>>> Hi,
>>>
>>> I'm working on a RNA-seq data set and would like to control for age or bmi in the model together with the disease status, which I am able to do.  My question is how can I get the coefficients for both age and the disease together?  what I did was
>>>
>>> #Adjust for age:
>>> cds.acromegaly.age = DESeqDataSetFromMatrix(countData=acromegaly.protein.coding, colData=acromegaly.mapping, design=~age+group)
>>> cds.acromegaly.age$age <- relevel(cds.acromegaly.age$age, "(0,40]")
>>> acromegaly.cds.age <- DESeq(cds.acromegaly.age)
>>> acromegaly.results.age <- results(acromegaly.cds.age)
>>>
>>> #Get the results for group (or disease):
>>> acromegaly.results.age <- results(acromegaly.cds.age)
>>>
>>> #Get the results for age:
>>> acro.age.effect <- results(acromegaly.cds.age, contrast=c("age","(40.60]", "(0.40]"))
>>> sum(acro.age.effect$padj<0.05, na.rm=TRUE)
>>>
>>> Then I merge those two results to get the genes that are affected by age or affected by group, and affected by both.   Is there another way? Is there coefficients for each of the independent variables.  I saw only log2FCs and p-values.
>>> acro.age.combined <- merge(acromegaly.results.age, acro.age.effect, by.x="row.names", by.y="row.names")
>>>
>>> Thanks,
>>> Quynh
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list