[BioC] R: Re: R: Re: R: Re: Deseq2 and differentia expression

Tue Jul 15 09:48:40 CEST 2014

Hi Jarod

On 14/07/14 17:12, jarod_v6 at libero.it wrote:
 > So I have attached all the mails.

Please always take your time to read through the replies before asking 
the next question. And please structure your question by first 
explaining what you did, then what you expected to get that way, 
and finally what you got instead. Just pasting together all the various 
pieces of code you have tried so far is not helpful for us to figure out 
the issue.

And, as both Mike and I have asked you multiple times before: Please do 
not crosspost to the bioc-devel list, but do keep the bioconductor list 
on CC. These two lists are not the same!

> However the problem remain and I think it is always the same.

Below, you have again pasted snippets of all the different pieces of 
code that you have posted so far. Of course your problem remains if you 
keep using the same code.

What you should do is to use

resGA2 <- results(dds, lfcThreshold=1, altHypothesis="greaterAbs")

to do the test you want. Why are you still trying to use other ways to 
call 'results' even though I have explained to you twice that they are 
not what you want and you agreed?

Then, you should use 'subset' on the adjusted p values (column 'padj') 
to find the significant genes.

You don't get any significant hits this way. It seems that your noise 
level is too high to say for any of your genes with statistical 
confidence that the log2 fold change is stronger than +/-1. Maybe you 
should use a smaller threshold.
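
The two steps above could look roughly like the following sketch. Note 
the assumptions: the data set is simulated with 
makeExampleDESeqDataSet as a stand-in for your own 'dds', and the 
padj cutoff of 0.1 and the smaller threshold of 0.5 are only 
illustrative choices, not recommendations for your data.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for your own data set (assumption, for illustration only)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)
dds <- DESeq(dds)

# Test whether the absolute log2 fold change exceeds 1
resGA2 <- results(dds, lfcThreshold = 1, altHypothesis = "greaterAbs")

# Keep genes with a small adjusted p value; padj can be NA, so guard against that
sigGA2 <- subset(resGA2, !is.na(padj) & padj < 0.1)
nrow(sigGA2)

# If nothing is left, repeat the same test with a smaller threshold
resGA05 <- results(dds, lfcThreshold = 0.5, altHypothesis = "greaterAbs")
sigGA05 <- subset(resGA05, !is.na(padj) & padj < 0.1)
nrow(sigGA05)
```

The only thing that changes between the two calls is 'lfcThreshold'; 
the alternative hypothesis and the filtering step stay the same.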

Note in this context that our suggestion to test for significance of 
exceeding a threshold is not (yet?) that established in expression 
analysis. A more common approach is to get a list of all genes whose
log2 fold change is significantly different from zero, and then narrow down 
this list to those genes whose _estimate_ shows a log2 fold change 
stronger than +/-1. As we explain in our paper, we feel that this 
standard procedure is not appropriate to do what people actually want to do.
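
For comparison, the more common two-step procedure described above 
might be sketched as follows. Again, the simulated data set and the 
cutoff of 0.1 are assumptions made only for illustration.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for a real data set (assumption, for illustration only)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)
dds <- DESeq(dds)

# Step 1: default test, null hypothesis is a log2 fold change of zero
res0 <- results(dds)

# Step 2: filter on significance AND on the *estimated* log2 fold change
sigConv <- subset(res0, !is.na(padj) & padj < 0.1 & abs(log2FoldChange) > 1)
nrow(sigConv)
```

Here the threshold of 1 is applied only to the estimate, so the p value 
says nothing about whether the true log2 fold change exceeds it.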

However, in terms of stringency, _testing_ for an absolute log2 fold 
change of more than 1 is much more stringent than testing for a 
non-_zero_ log2 fold change and then using only those with an LFC 
estimate exceeding 1. Hence, if you want to get a significance level comparable 
to the latter by doing the former, you need to be more lenient with your 
threshold.
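
To make the difference in stringency concrete, here is a small 
base-R sketch for a single hypothetical gene. The estimate, the 
standard error, and the simplified Wald-type p values are my own 
illustrative assumptions; this is not meant to reproduce exactly what 
DESeq2 computes internally.

```r
# Hypothetical gene: estimated log2 fold change and its standard error
lfc <- 1.5
se <- 0.4
threshold <- 1

# Wald-type two-sided p value for the null hypothesis LFC = 0
p_zero <- 2 * pnorm(abs(lfc) / se, lower.tail = FALSE)

# Wald-type p value for the null hypothesis |LFC| <= threshold
p_thresh <- min(1, 2 * pnorm((abs(lfc) - threshold) / se, lower.tail = FALSE))

# The estimate exceeds 1 and is clearly non-zero ...
abs(lfc) > threshold && p_zero < 0.05

# ... but is the true |LFC| shown to exceed 1 with the same confidence?
p_thresh < 0.05
```

The same gene passes the conventional filter but not the threshold 
test, because the second test only uses the part of the estimate that 
lies beyond the threshold, measured against the same standard error.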

   Simon