[BioC] GOstats and GenePix arrays

Thu May 11 21:12:59 CEST 2006

Thanks Thomas.  This is the kind of thing I was looking for.  Thanks to
all for their suggestions and encouragement.  

I know building a custom annotation package is the ideal scenario, but
for "niche" organisms (i.e., not human, rat, or mouse) this isn't always
realistic.  Annotation is often gathered by hand and merged into tables,
etc. and it's sometimes difficult to conform to the BioC annotation
package standards.  It's nice to see functions that will work with both
standard BioC annotations as well as more generic tabular annotation.
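
For anyone following along, here is a minimal sketch of how the phyper
call quoted further down could be driven from a plain gene-to-GO table.
This is not GOHyperG or GOHyperGAll; the names go_map and go_hyper and
the gene/GO identifiers are made up for illustration.  It assumes the
table lists only features present on the chip, so that m and n are
counted over the chip rather than over the whole genome.  That is a
common convention for the gene "universe", but it is my assumption here,
not something settled in this thread.

```r
# Hypothetical two-column annotation table: one row per gene/GO pair.
# Assumed to contain only genes that are on the chip.
go_map <- data.frame(
  gene = c(paste0("g", 1:4), paste0("g", 4:10)),
  go   = rep(c("GO:A", "GO:B"), times = c(4, 7)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: over-representation p-value for every GO node.
go_hyper <- function(go_map, sample_genes) {
  go_map <- unique(go_map[, c("gene", "go")])   # drop duplicate pairs
  n <- length(unique(go_map$gene))              # genes with any GO mapping
  sample_map <- go_map[go_map$gene %in% sample_genes, ]
  k <- length(unique(sample_map$gene))          # sample genes with GO mappings
  m <- table(go_map$go)                         # genes at each GO node
  # sample genes at each GO node, in the same node order as m
  x <- table(factor(sample_map$go, levels = names(m)))
  # P(X >= x): upper tail starting at x, hence x - 1
  p <- phyper(as.numeric(x) - 1, as.numeric(m), n - as.numeric(m), k,
              lower.tail = FALSE)
  data.frame(go = names(m), x = as.numeric(x), m = as.numeric(m), p = p,
             stringsAsFactors = FALSE)
}

sample_genes <- c("g1", "g2", "g3")
res <- go_hyper(go_map, sample_genes)
print(res)
```

Swapping in a genome-wide table instead would change m and n, and with
them the p-values, so the choice of table is where the "universe"
decision gets made.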

Thanks,

Jake

On Thu, 2006-05-11 at 11:34 -0700, Thomas Girke wrote:
> > Jake,
> > I believe I have posted this description on the web for my own version
> > of GOHyperG, which I called GOHyperGAll. I wrote this function for my
> > work with organisms that don't have locusID/chipID-to-GO mappings.
> > GOHyperGAll lets you work with your own gene-to-GO or
> > chip_feature-to-GO mappings by providing a custom mapping file. Feel
> > free to try this function. It is available at:
> > http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/R_BioCondManual.html#GOHyperGAll
> > 
> > 
> > Thomas
> > 
> > 
> > 
> > On Thu 05/11/06 11:21, Jake wrote:
> > > Hi all,
> > > 
> > > I'm trying to use the "guts" of the GOHyperG function in GOstats as a
> > > basis for a similar function for GenePix data.  I've found a basic
> > > description of the phyper function in the context of GO:
> > > 
> > > # How to implement phyper function for GO analysis
> > > #       phyper(x-1, m, n-m, k, lower.tail = FALSE)
> > > #       x: number of sample genes at GO node (can be vector with many entries)
> > > #       m: number of genes at GO node (works with vector of same length as x)
> > > #       n: number of unique genes at all GO nodes
> > > #       k: number of unique genes in test sample that have GO mappings
> > > 
> > > Values for x and k seem straightforward, but I'm wondering about m and
> > > n.  The arrays we're working with seem to have fewer genes on them than
> > > the total number cataloged in the organism's online databases.  So
> > > should m and n be based on the absolute total number of genes annotated,
> > > or the number of genes annotated *on the chip*?
> > > 
> > > Thanks in advance,
> > > 
> > > Jake
> > > 
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> > > 
> > 
> 
> 

-- 
Thomas Girke, Ph.D.
1008 Noel T. Keen Hall
Center for Plant Cell Biology (CEPCEB)
University of California
Riverside, CA 92521

E-mail: thomas.girke at ucr.edu
Website: http://faculty.ucr.edu/~tgirke
Ph: 951-827-2469
Fax: 951-827-4437