[BioC] Odd Odds Ratio in GOstats package

James W. MacDonald jmacdon at uw.edu
Mon Apr 28 16:34:51 CEST 2014


Hi Moreno,

The odds ratio is

(odds of being mapped to GO term X | in significant set) /
(odds of being mapped to GO term X | not in significant set)

If all of the genes mapped to GO term X are in the significant set, then
no gene outside the significant set is mapped to GO term X, so the odds in
the denominator are zero. Divide by zero, you get Inf. That is exactly the
case where Count equals Size in your table.
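
To make that concrete, here is a minimal sketch of the 2x2 table arithmetic
in plain R. It is not taken from the GOstats source; the function name
odds_ratio and its arguments are made up for illustration, and the
selected-set size (3000) and universe size (10000) are hypothetical, since
neither number appears in this thread.

```r
# Odds ratio from the four cells of a 2x2 table (illustrative sketch only).
# count      : selected genes annotated to the term   (Count column)
# size       : universe genes annotated to the term   (Size column)
# n_selected : number of selected genes               (hypothetical here)
# n_universe : number of genes in the universe        (hypothetical here)
odds_ratio <- function(count, size, n_selected, n_universe) {
  n11 <- count                            # selected, annotated
  n12 <- n_selected - count               # selected, not annotated
  n21 <- size - count                     # not selected, annotated
  n22 <- n_universe - n_selected - n21    # not selected, not annotated
  (n11 / n12) / (n21 / n22)
}

# Count == Size, so n21 is 0 and the denominator odds are 0: result is Inf
odds_ratio(14, 14, 3000, 10000)

# One annotated gene falls outside the selected set: result is finite
odds_ratio(13, 14, 3000, 10000)
```

If you want to pull out those rows yourself, is.infinite() on the OddsRatio
column is a cleaner test than comparing against the string "Inf".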

Best,

Jim


On 4/28/2014 6:42 AM, Moreno Colaiacovo wrote:
> Dear All,
>
> I am using the GOstats package to calculate enriched Gene Ontology terms in
> a gene set. I followed the steps described in the manual; however, for some
> GO terms the summary table gives an Odds Ratio equal to "Inf". This always
> happens when the gene counts and the universe counts are the same. Can
> someone please explain this result to me?
>
> See for example:
>
>> summary(hgOver_BP)[summary(hgOver_BP)$OddsRatio=="Inf",]
>          GOBPID       Pvalue OddsRatio  ExpCount Count Size
> 211  GO:0015985 6.838762e-08       Inf 4.3127788    14   14
> 212  GO:0015986 6.838762e-08       Inf 4.3127788    14   14
> 320  GO:0042026 7.645560e-06       Inf 3.0805563    10   10
> 360  GO:0000028 2.485174e-05       Inf 2.7725007     9    9
> 361  GO:0000338 2.485174e-05       Inf 2.7725007     9    9
> 362  GO:0006613 2.485174e-05       Inf 2.7725007     9    9
> 363  GO:0010388 2.485174e-05       Inf 2.7725007     9    9
> 404  GO:0009188 8.076815e-05       Inf 2.4644450     8    8
>
> How is the OddsRatio calculated here? According to the definition of Odds
> Ratio that I know, I don't see why two equal counts should give an infinite
> number.
>
> Many thanks in advance
>
> Best regards
>
> Moreno
>
>> sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>   
> locale:
> [1] LC_COLLATE=Italian_Italy.1252
> [2] LC_CTYPE=Italian_Italy.1252
> [3] LC_MONETARY=Italian_Italy.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=Italian_Italy.1252
>   
> attached base packages:
> [1] parallel  stats     graphics  grDevices
> [5] utils     datasets  methods   base
>   
> other attached packages:
>   [1] org.Hs.eg.db_2.10.1  org.Mm.eg.db_2.10.1
>   [3] GOstats_2.28.0       graph_1.40.1
>   [5] Category_2.28.0      GO.db_2.10.1
>   [7] Matrix_1.1-3         ReactomePA_1.6.1
>   [9] AnnotationDbi_1.24.0 Biobase_2.22.0
> [11] BiocGenerics_0.8.0   RSQLite_0.11.4
> [13] DBI_0.2-7            biomaRt_2.18.0
> [15] xlsx_0.5.5           xlsxjars_0.6.0
> [17] rJava_0.9-6
>   
> loaded via a namespace (and not attached):
>   [1] annotate_1.40.1       AnnotationForge_1.4.4
>   [3] colorspace_1.2-4      dichromat_2.0-0
>   [5] digest_0.6.4          DO.db_2.7
>   [7] DOSE_2.0.0            genefilter_1.44.0
>   [9] ggplot2_0.9.3.1       GOSemSim_1.20.3
> [11] graphite_1.8.1        grid_3.0.2
> [13] GSEABase_1.24.0       gtable_0.1.2
> [15] igraph_0.7.0          IRanges_1.20.7
> [17] labeling_0.2          lattice_0.20-29
> [19] MASS_7.3-31           munsell_0.4.2
> [21] plyr_1.8.1            proto_0.3-10
> [23] qvalue_1.36.0         RBGL_1.38.0
> [25] RColorBrewer_1.0-5    Rcpp_0.11.1
> [27] RCurl_1.95-4.1        reactome.db_1.46.1
> [29] reshape2_1.2.2        scales_0.2.3
> [31] splines_3.0.2         stats4_3.0.2
> [33] stringr_0.6.2         survival_2.37-7
> [35] tcltk_3.0.2           tools_3.0.2
> [37] XML_3.98-1.1          xtable_1.7-3
>
> ===============================
> Moreno Colaiacovo
> Computational Biologist
> Genomnia srl

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


