[BioC] minimal number of features tested in edgeR

Stephanie [guest] guest at bioconductor.org
Thu Oct 25 11:23:20 CEST 2012


I have a question about the minimum number of genes that can be tested in an edgeR analysis. To explain: in a study, edgeR was used to test the differential expression of three viruses between two conditions, without including the counts for any other features. That is, the data frame d$counts has only three rows (and four columns, as there are two replicates per condition). The library sizes, however, correspond to the total number of tags aligned to both these viruses and the genes of the host organism. This seems inappropriate to me, as I don't see how the dispersion could be estimated reliably from only three features, but maybe I'm wrong... May I have your opinion?
In your view, what is the minimum number of features that can be tested using edgeR?
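
To make the setup concrete, here is a minimal sketch of the kind of object described above. The counts, library sizes, and feature/sample names are all made up for illustration (they are not the data from the study); only the shape, three rows and four columns with library sizes supplied separately, follows the description.

```r
library(edgeR)

# Hypothetical counts: 3 viral features (rows) x 4 samples (columns).
# These values are invented for illustration only.
counts <- matrix(
  c(120,  95, 310, 280,
     15,  22,  18,  25,
    540, 610, 130, 150),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("virusA", "virusB", "virusC"),
                  c("cond1_rep1", "cond1_rep2", "cond2_rep1", "cond2_rep2"))
)

# Two conditions, two replicates each.
group <- factor(c("cond1", "cond1", "cond2", "cond2"))

# Hypothetical library sizes: total tags aligned to viruses plus host genes,
# so they are much larger than the column sums of 'counts'.
lib.sizes <- c(12500000, 11800000, 13100000, 12200000)

# Build the DGEList with the externally supplied library sizes.
d <- DGEList(counts = counts, lib.size = lib.sizes, group = group)

# The common dispersion is then estimated from these three features only.
d <- estimateCommonDisp(d)
d$common.dispersion
```

With such an object, whatever dispersion estimate comes out rests on three features and two replicates per condition, which is the situation I am asking about.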

Thank you in advance for your help.

Best regards,


 -- output of sessionInfo(): 

R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)

 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] edgeR_2.6.2  limma_3.12.0

loaded via a namespace (and not attached):
 [1] annotate_1.34.0      AnnotationDbi_1.18.0 Biobase_2.16.0      
 [4] BiocGenerics_0.2.0   DBI_0.2-5            DESeq_1.8.2         
 [7] genefilter_1.38.0    geneplotter_1.34.0   grid_2.15.0         
[10] IRanges_1.14.3       RColorBrewer_1.0-5   RSQLite_0.11.1      
[13] splines_2.15.0       stats4_2.15.0        survival_2.36-14    
[16] xtable_1.7-0 

