[BioC] Deseq2 for down stream analysis

Michael Love michaelisaiahlove at gmail.com
Sun Aug 10 17:44:29 CEST 2014


On Sun, Aug 10, 2014 at 9:41 AM, Fabrice Tourre <fabrice.ciup at gmail.com> wrote:
> Dear Mike,
>
> Thank you for your reply. I need a matrix for each gene and sample for
> gene set enrichment analysis.
>
> In your example, what about this situation:
>
> [0,0,0] vs [1,2,3]
>
> [0,0,0] vs [10,10,10]
>

In my previous email, I was just trying to illustrate the concept.
It is better to rank your results table by LFC and by p-value and see
for yourself how the two rankings differ on real data.
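
As a rough sketch of that comparison (not a recommended GSEA workflow),
the following ranks one results table both ways and counts how many genes
the two top lists share. The simulated dataset from
makeExampleDESeqDataSet(), the seed, and the cutoff of 50 genes are
assumptions chosen only for illustration; substitute your own dds.
Depending on the DESeq2 version, the log2FoldChange column from results()
may or may not be shrunken, so check the documentation for your version.

```r
library(DESeq2)

# Simulated data as a stand-in for a real experiment (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6, betaSD = 1)
dds <- DESeq(dds)
res <- results(dds)

# Rank by effect size: largest absolute log2 fold change first (NAs go last).
byLFC <- res[order(abs(res$log2FoldChange), decreasing = TRUE), ]

# Rank by evidence: smallest p-value first (NAs go last).
byP <- res[order(res$pvalue), ]

# How many genes appear in both top lists? (top = 50 is arbitrary)
top <- 50
shared <- intersect(head(rownames(byLFC), top), head(rownames(byP), top))
length(shared)
```

If the downstream tool wants a single signed score per gene, the stat
column of the results table (the Wald statistic) is another candidate, but
whether that suits a given tool is for its documentation to say.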

If your downstream method is designed to take as input expression
matrices similar to normalized microarray datasets (log scale) then
you can use rlog or VST, and use the matrix accessed with
assay(object).
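
A minimal sketch of pulling out those matrices, assuming a simulated
dataset from makeExampleDESeqDataSet() in place of your own dds (the
dimensions are arbitrary):

```r
library(DESeq2)

# Simulated counts: 1000 genes x 6 samples (assumption, for illustration).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Both transformations return an object holding log2-scale values.
rld <- rlog(dds)
vsd <- varianceStabilizingTransformation(dds)

# assay() extracts the genes x samples numeric matrix.
rlogMat <- assay(rld)
vstMat  <- assay(vsd)

dim(rlogMat)
dim(vstMat)
```

Either matrix has one row per gene and one column per sample, which is the
shape that microarray-style tools usually expect.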

If the downstream method is designed to take as input RNA-Seq counts,
then you shouldn't use our transformations, as typically count-based
methods have special requirements on the properties of the input data.
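
For reference, a short sketch of extracting counts instead, again assuming
simulated data from makeExampleDESeqDataSet(). Whether a count-based tool
wants raw or size-factor-normalized counts is something only its
documentation can answer.

```r
library(DESeq2)

# Simulated counts as a stand-in for a real dataset (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Raw counts, untransformed.
rc <- counts(dds)

# Counts divided by per-sample size factors (still not log scale).
dds <- estimateSizeFactors(dds)
nc <- counts(dds, normalized = TRUE)

dim(rc)
dim(nc)
```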

It's up to you to read the documentation of the downstream method and
figure out which should be the input, or if uncertain, email the
maintainers of that software.

Mike

> I have a lot of genes like this.
>
> On Sun, Aug 10, 2014 at 9:29 PM, Michael Love
> <michaelisaiahlove at gmail.com> wrote:
>> hi Fabrice,
>>
>> On Sun, Aug 10, 2014 at 8:27 AM, Fabrice Tourre <fabrice.ciup at gmail.com> wrote:
>>> Dear expert,
>>>
>>> I've been using DESeq for my RNA-Seq differential expression analysis.
>>> Now I want to do GSEA. I have the following expression values. Which one
>>> should I use for the downstream analysis?
>>
>> Please provide more details about the downstream analysis.
>>
>> Do you need a matrix of values for each gene and sample, or just the
>> test statistic for each gene?
>>
>>> rc, rld or vsd?
>>>
>>> rc <- counts(dds)
>>> rld <- rlog(dds)
>>> vsd <- varianceStabilizingTransformation(dds)
>>> rlogMat <- assay(rld)
>>> vstMat <- assay(vsd)
>>>
>>> Then I want to use the DESeq result to generate a ranked-list, which
>>> will be used as the input in GSEA. My question is: Should I rank the
>>> genes using the fold changes or using the q-values?
>>>
>>
>> You can use the shrunken fold changes or p-values for ranking. The
>> fold change measures the effect itself, while the p-value is a
>> function of how distinct the changes are, so the signal over the
>> noise. For example, consider a comparison of two groups with three
>> values each (here continuous values just for demonstration): [3,4,5]
>> vs [1,2,3] has a fold change of 2, whereas [11,11,11] vs [10,10,10]
>> has a fold change of 1.1, but the second comparison will have a lower
>> p-value because the variance within groups is so small.
>>
>> Mike
>>
>>> Thank you very much in advance.