[BioC] EdgeR: dispersion estimation

Dear Gordon,

Thank you so much for your reply and help. I greatly appreciate it.

My data are genuine count data, that is, integers. I got the same results after I updated R and edgeR and reran the code.

I have some questions about dispersion estimation with the edgeR package. As a brief introduction to my multi-factor project (please refer to my previous post for more details), I have three factors: L (16 levels), S (2 levels) and R (3 levels), so in total I have 16 x 2 x 3 = 96 different conditions.

1. For various reasons, one of the conditions has only one replicate, while all the other conditions have at least 5 replicates. In this situation, how does edgeR estimate the common dispersion and the tagwise dispersions?
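To make the layout in question 1 concrete, here is a minimal base-R sketch that tabulates replicates per condition and finds the singleton. The sample sheet is hypothetical: the factor names L, S and R come from the post, but the replicate numbers and the choice of which condition has one library are made up for illustration. It assumes a cell-means layout with one level per condition; an additive design matrix would give different residual degrees of freedom.

```r
# Hypothetical sample sheet: one row per library, 5 replicates per condition.
samples <- expand.grid(L = paste0("L", 1:16),
                       S = paste0("S", 1:2),
                       R = paste0("R", 1:3),
                       rep = 1:5)

# Remove four replicates of one (arbitrarily chosen) condition,
# leaving it with a single library.
drop <- samples$L == "L1" & samples$S == "S1" &
        samples$R == "R1" & samples$rep > 1
samples <- samples[!drop, ]

# One combined factor with a level per condition (16 x 2 x 3 levels).
condition <- interaction(samples$L, samples$S, samples$R, drop = TRUE)

# Number of libraries in each condition, in the order of levels(condition).
n_per_condition <- tabulate(condition, nbins = nlevels(condition))

# Conditions represented by exactly one library.
singletons <- levels(condition)[n_per_condition == 1]

# Residual degrees of freedom for a one-way layout on 'condition':
# each condition contributes (replicates - 1), so a singleton contributes 0.
residual_df <- sum(n_per_condition - 1)

print(nlevels(condition))
print(singletons)
print(residual_df)
```

Under this layout the single-replicate condition adds nothing to the residual degrees of freedom, so any information about variability has to come from the replicated conditions.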

2. I looked at many examples of dispersion estimation with edgeR (including those in the edgeR User's Guide) and found that the common dispersion was below 1 in most cases. However, I got 3.999943 for the common dispersion and 0.0624991 for every tagwise dispersion. When the tagwise dispersions all approach the same value, shouldn't that value be close to the common dispersion?
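A quick arithmetic check on the two numbers reported in question 2 may be useful. The sketch below only does arithmetic on those values. Reading the result as a sign that both estimates sit at the edge of a search range, rather than at an interior optimum, is an assumption and is not confirmed anywhere in this thread.

```r
common  <- 3.999943    # common dispersion reported in the post
tagwise <- 0.0624991   # value shared by all tagwise dispersions in the post

# Ratio of the two estimates and its base-2 logarithm.
ratio <- common / tagwise
print(ratio)        # very close to 64
print(log2(ratio))  # very close to 6

# Square root of the common dispersion, i.e. the implied
# biological coefficient of variation.
bcv <- sqrt(common)
print(bcv)          # very close to 2
```

A common dispersion near 4 corresponds to a coefficient of variation near 200%, which is far larger than in the User's Guide examples and is consistent with something being wrong in the input or the fit.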

3. Using the current version of edgeR, I tried different values for prior.df (including the default) in the estimateTagwiseDisp function. However, I always got the same tagwise dispersion estimates. Are there any other arguments to estimateTagwiseDisp that affect the estimates? And can users set these arguments according to their own projects?

Please feel free to correct me if I have made any mistakes here. I would also greatly appreciate any other suggestions you can provide.

Dear Yanzhu,

My guess is that some of your "count data" are not integers.  For example, 
are they perhaps expected counts from RSEM?  In the edgeR version that you 
are using, the GLM dispersion estimation functions do not work correctly 
for non-integer data.  (They weren't intended to.)

Please update your copies of R and edgeR to the latest versions. 
Bioconductor 2.14 was released a couple of weeks ago.  All edgeR functions 
now permit non-integer "counts".

Also check that your data are counts and not RPKM or similar.  The counts 
should sum to the total sequence depth for each sample.
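The two checks suggested above (integer values, and column totals that match the sequencing depth) can be sketched in base R as follows. The matrix `counts` and its gene and sample names are hypothetical stand-ins for the real data, with one fractional value planted on purpose.

```r
# Hypothetical count matrix: rows are genes/tags, columns are libraries.
counts <- matrix(c(10,  0,  25,
                   3.7, 12, 8,
                   100, 45, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("g1", "g2", "g3"),
                                 c("s1", "s2", "s3")))

# TRUE for every entry that is a non-negative whole number.
is_count <- counts >= 0 & counts == round(counts)

# Are all entries valid counts?
all_integer <- all(is_count)

# Rows containing at least one fractional or negative value.
bad_rows <- rownames(counts)[rowSums(!is_count) > 0]

# Column totals: for raw counts these should match the number of
# reads assigned to features in each library.
lib_sizes <- colSums(counts)

print(all_integer)
print(bad_rows)
print(lib_sizes)
```

If `all_integer` is FALSE, or the column totals are nowhere near the known read depth of each library, the input is probably expected counts or a normalized measure such as RPKM rather than raw counts.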

