[BioC] Can DESeq2 find tissue-specific expression genes
anders at embl.de
Fri Dec 20 10:04:16 CET 2013
On 20/12/13 04:03, Eman Lee [guest] wrote:
> How can we use DESeq2 to determine which genes are tissue-specific expression?
> Can we take Tissue1 as CASE, other 9 tissues as CONTROL, to find tissue-specific genes (high expression in Tissue1) ?
In principle, yes. However, with this setup, DESeq2 would look for
genes where Case differs from Control much more than the Control samples
differ from each other, i.e., extreme cases where a gene has very
similar expression in all tissues but one, and a very different
expression in that remaining tissue. If a gene stands out in more than
one tissue (e.g., strong in two and weak in eight tissues), you wouldn't
find it.
The conventional way would be to do a likelihood ratio test to see
whether the tissue effect is significant, i.e., compare the models
"count ~ tissue" against "count ~ 1".
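
As a rough sketch of this comparison, assuming a recent DESeq2 release (function names in older versions may differ) and simulated counts in place of real data. The object names, the T1..T10 tissue labels, and the spiked-in genes are made up for illustration:

```r
library(DESeq2)

set.seed(1)

# Hypothetical layout: 10 tissues with 3 replicates each
tissue <- factor(rep(paste0("T", 1:10), each = 3), levels = paste0("T", 1:10))
samples <- paste0("s", seq_along(tissue))
n_genes <- 300

# Simulated negative binomial counts stand in for a real count matrix
base_mean <- 2^runif(n_genes, 5, 10)
counts <- matrix(rnbinom(n_genes * length(tissue), mu = base_mean, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), samples))
# Raise the first 20 genes about eightfold in tissue T1 only
counts[1:20, tissue == "T1"] <- rnbinom(20 * 3, mu = 8 * base_mean[1:20], size = 10)

coldata <- data.frame(tissue = tissue, row.names = samples)
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ tissue)

# Likelihood ratio test: full model ~ tissue against reduced model ~ 1
dds <- DESeq(dds, test = "LRT", reduced = ~ 1)
res <- results(dds)

# The p values refer to the tissue factor as a whole; the log2FoldChange
# column of an LRT result shows only a single coefficient
head(res[order(res$padj), ])
```

A small adjusted p value here only says that the gene's expression depends on tissue in some way, not which tissue is responsible.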
You can then look at the shrunken log fold changes reported by DESeq2
for the individual tissues to find out which tissue(s) are different.
Alternatively, you can run Wald tests (DESeq2 offers both likelihood
ratio tests and Wald tests) and use the Wald test p values to find
tissues which differ significantly from the average for a gene.
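
One way to approximate such a comparison against the average with an ordinary (non-expanded) design is a numeric contrast. This is only a sketch under assumptions: it uses a recent DESeq2 release, simulated counts, unshrunken Wald estimates, and a hypothetical helper `contrast_vs_average()`. It also assumes the coefficients are named like "tissue_T3_vs_T1", which is worth checking with resultsNames():

```r
library(DESeq2)

set.seed(2)

# Hypothetical layout: 4 tissues with 3 replicates each
tissue_levels <- paste0("T", 1:4)
tissue <- factor(rep(tissue_levels, each = 3), levels = tissue_levels)
samples <- paste0("s", seq_along(tissue))
n_genes <- 300

# Simulated counts; the first 20 genes are raised about eightfold in T3
base_mean <- 2^runif(n_genes, 5, 10)
counts <- matrix(rnbinom(n_genes * length(tissue), mu = base_mean, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), samples))
counts[1:20, tissue == "T3"] <- rnbinom(20 * 3, mu = 8 * base_mean[1:20], size = 10)

coldata <- data.frame(tissue = tissue, row.names = samples)
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ tissue)
dds <- DESeq(dds)             # Wald test is the default
coefs <- resultsNames(dds)    # intercept plus one coefficient per non-base tissue
k <- nlevels(tissue)

# Hypothetical helper: weights for "one tissue minus the mean of all tissues"
# on the log2 scale. With base level T1, the tissue means are b0 and b0 + b_j,
# so their average is b0 + sum(b_j) / k.
contrast_vs_average <- function(level) {
  w <- ifelse(coefs == "Intercept", 0, -1 / k)
  hit <- coefs == paste0("tissue_", level, "_vs_", tissue_levels[1])
  w[hit] <- w[hit] + 1        # no match for the base level, as intended
  w
}

res_T3 <- results(dds, contrast = contrast_vs_average("T3"))
head(res_T3[order(res_T3$padj), ])
```

Repeating the last two lines for each tissue gives one Wald test per tissue; the p values would still need to be interpreted with the number of tissues tested in mind.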
In standard linear modelling, you have to assign one of the tissues as
your "base level". It gets absorbed into the model's intercept and all
other tissues' expressions are reported relative to it, and the log fold
changes get shrunken towards it (if you use DESeq2's coefficient
shrinkage). This is undesirable as it makes one tissue special. To solve
this, we have, very recently, implemented "expanded design matrices" in
the devel version of DESeq2, and this might be quite useful for you.
(The original motivation was also a search for tissue-specific usage, in
that case of exons; see Reyes et al., PNAS 2013, 110:15377).