[BioC] About the "Size" column in the output of hyperGTest (GOstats package)

Marc Carlson mcarlson at fhcrc.org
Fri May 29 00:58:22 CEST 2009


Hi Greg,

I am a little confused by your description of what you did.  After
peering at your explanation, I am still not completely certain that I
understand what your question is.  But, there is a nice description of
how the gene universe can affect the number of things you find in the
GOstats vignette titled "Hypergeometric Tests Using GOstats".  Perhaps
this can help you?

http://www.bioconductor.org/packages/devel/bioc/html/GOstats.html
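
As a rough illustration of the point about the gene universe, here is a
minimal base R sketch. It does not call GOstats: the gene IDs, the two
universes, and the helper hyper_row() are invented for this example, and
it assumes the usual over-representation setup (Size = genes annotated at
the term within the universe, Count = selected genes among those,
ExpCount = number selected * Size / universe size).

```r
# Toy illustration only: every gene ID and annotation here is invented.
# Genes annotated at a single GO term (hypothetical).
term_genes <- paste0("g", 1:12)

# Two candidate gene universes; the second is a subset of the first.
universe_big   <- paste0("g", 1:40)
universe_small <- paste0("g", c(1:5, 21:35))

# The selected ("interesting") genes, identical in both analyses.
selected <- paste0("g", c(1:4, 21:26))

# hyper_row() is a hypothetical helper, not part of GOstats or Category.
hyper_row <- function(selected, universe, term_genes) {
  selected <- intersect(selected, universe)    # drop genes outside the universe
  in_term  <- intersect(term_genes, universe)  # term genes restricted to the universe
  size     <- length(in_term)                  # the "Size" column
  count    <- length(intersect(selected, in_term))
  n_univ   <- length(universe)
  n_sel    <- length(selected)
  data.frame(
    # P(X >= count) under the hypergeometric distribution
    Pvalue   = phyper(count - 1, size, n_univ - size, n_sel, lower.tail = FALSE),
    ExpCount = n_sel * size / n_univ,
    Count    = count,
    Size     = size
  )
}

res_big   <- hyper_row(selected, universe_big, term_genes)
res_small <- hyper_row(selected, universe_small, term_genes)
print(rbind(big = res_big, small = res_small))
```

With these toy inputs the term has Size 12 in the larger universe and
Size 5 in the smaller one, while Count stays at 4. Only the universe
changed, so a different Size for the same term usually means the two runs
were given different gene universes.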


  Marc




gregory voisin wrote:
> Hi,
>
> I need a clarification about the "Size" column in the output of hyperGTest (GOstats package).
>
> In https://stat.ethz.ch/pipermail/bioconductor/2006-December/015346.html
> we can read: "The "Size" column is the number of genes annotated at the given GO
> term (where genes are restricted to the defined gene universe)"
> Hence, for a given term and a given platform, this number should be constant.
>
> To explain: the first data set, A, contains 687 probesets.
> I ran hyperGTest on it.
> Here is an extract from the result:
>    GOBPID     Pvalue       OddsRatio ExpCount   Count Size Term
> 36 GO:0008283 0.0180640706 1.913970  8.20236088 15    266  cell proliferation
>
> If I understand correctly, 266 probesets on the Affymetrix HG-U133 Plus 2.0 array are annotated "cell proliferation".
>
>
> Then I ran the same analysis on a second set (B), which includes A: 414 probesets.
>
> Result:
>    GOBPID     Pvalue      OddsRatio ExpCount    Count Size Term
> 20 GO:0008283 0.008295992 1.765957  14.97834828 25    745  cell proliferation
>
> Here, that means that 745 probesets are annotated "cell proliferation".
>
>
>
> Why is the Size for the same term not the same in the two analyses?
>
> Moreover, since B includes A, the 25 probesets annotated "cell proliferation" found in the B analysis are reduced to 15 probesets in the A analysis. I would expect the A analysis to give at least 25 probesets annotated "cell proliferation".
>
> Why didn't I find at least 25 probesets in the A analysis?
>
>
>
>
> Thanks 
> Greg
>
>
>   
>> sessionInfo()
>>     
> R version 2.8.1 (2008-12-22) 
> i386-pc-mingw32 
>
> locale:
> LC_COLLATE=French_Canada.1252;LC_CTYPE=French_Canada.1252;LC_MONETARY=French_Canada.1252;LC_NUMERIC=C;LC_TIME=French_Canada.1252
>
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets  methods   base     
>
> other attached packages:
>  [1] GOstats_2.8.0        Category_2.8.4       genefilter_1.22.0    survival_2.34-1      RBGL_1.18.0          annotate_1.20.1      xtable_1.5-4        
>  [8] graph_1.20.0         GO.db_2.2.5          hgu133plus2.db_2.2.5 RSQLite_0.7-1        DBI_0.2-4            AnnotationDbi_1.4.3  Biobase_2.2.2       
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 GSEABase_1.4.0  XML_1.99-0 
>


