[BioC] Gostats with Yeast annotation

Marc Carlson mcarlson at fhcrc.org
Wed Feb 11 18:05:31 CET 2009


Hi Yolande,

You do not need the other version to run GOstats.  I only mentioned it in
case you (or others) really want/need an Entrez Gene ID for some other
reason.  The one exception would be if you only had Entrez Gene IDs and
needed a way to convert them back into systematic names.  That would be
a pretty unusual case, though, since most people working with yeast use
the systematic names and not Entrez Gene IDs.  To use GOstats with yeast,
you should always just use the systematic names (YAL002W, for example).
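
If a gene list comes from mixed sources, a quick check along the lines
below can flag entries that are not systematic names before they go into
geneIds or universeGeneIds.  This is only a sketch in base R: the pattern
and the helper name is_systematic_orf are assumptions made for
illustration, the pattern covers nuclear ORFs only (mitochondrial names
such as Q0045 would be flagged), and the numeric ID is a placeholder
rather than a real Entrez Gene ID.

```r
# Assumed pattern for nuclear systematic ORF names: "Y", a chromosome
# letter A-P, arm L or R, three digits, strand W or C, and an optional
# suffix such as "-A".
orf_pattern <- "^Y[A-P][LR][0-9]{3}[WC](-[A-Z])?$"

# Hypothetical helper: TRUE for IDs that look like systematic ORF names.
is_systematic_orf <- function(ids) {
  grepl(orf_pattern, toupper(trimws(ids)))
}

# Two systematic names, a placeholder numeric ID, and a common gene name.
ids <- c("YAL002W", "YHR143W-A", "123456", "VPS8")
ok  <- is_systematic_orf(ids)

# Entries that would need converting or dropping before running GOstats.
ids[!ok]
```

Anything flagged here would have to be mapped to a systematic name some
other way; the check itself does no conversion.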


  Marc




Yolande Tra wrote:
> Hi Marc,
>  
> Thank you for the info. If I want to do a gene set enrichment analysis
> using KEGG, do I need the other version that has Entrez Gene IDs? If
> so, would you point me to the link where I can find this version of the
> package? Thanks again.
>  
> Yolande
>
> ------------------------------------------------------------------------
> *From:* Marc Carlson [mailto:mcarlson at fhcrc.org]
> *Sent:* Tue 2/10/2009 3:05 PM
> *To:* Yolande Tra
> *Cc:* ag357 at cam.ac.uk; bioconductor at stat.math.ethz.ch
> *Subject:* Re: [BioC] Gostats with Yeast annotation
>
> Hi Yolande,
>
> Unlike the "eg" packages, the annotation package org.Sc.sgd.db is based
> on SGD instead of NCBI. That means that the central IDs are the
> systematic yeast identifiers that you can see in the examples below
> (YAL002W, for example). So these IDs (and not Entrez Gene IDs) become the
> currency for dealing with GOstats when using yeast. Alternatively, if
> you need an Entrez Gene ID for some other reason, the version of this
> package in the devel branch will let you get one.
>
> Marc
>
>
>
> Yolande Tra wrote:
> > Hi Alex,
> >
> > I need some help. For yeast with two-color microarrays, do you know
> > which identifier to use? There is no org.Sc.sgdENTREZID in the package:
> >
> > ls("package:org.Sc.sgd.db")
> >  [1] "org.Sc.sgd"            "org.Sc.sgd_dbconn"     "org.Sc.sgd_dbfile"     "org.Sc.sgd_dbInfo"
> >  [5] "org.Sc.sgd_dbschema"   "org.Sc.sgdALIAS"       "org.Sc.sgdCHR"         "org.Sc.sgdCHRLENGTHS"
> >  [9] "org.Sc.sgdCHRLOC"      "org.Sc.sgdCOMMON2ORF"  "org.Sc.sgdDESCRIPTION" "org.Sc.sgdENZYME"
> > [13] "org.Sc.sgdENZYME2ORF"  "org.Sc.sgdGENENAME"    "org.Sc.sgdGO"          "org.Sc.sgdGO2ALLORFS"
> > [17] "org.Sc.sgdGO2ORF"      "org.Sc.sgdINTERPRO"    "org.Sc.sgdMAPCOUNTS"   "org.Sc.sgdORGANISM"
> > [21] "org.Sc.sgdPATH"        "org.Sc.sgdPATH2ORF"    "org.Sc.sgdPFAM"        "org.Sc.sgdPMID"
> > [25] "org.Sc.sgdPMID2ORF"    "org.Sc.sgdREJECTORF"   "org.Sc.sgdSMART"
> >
> > Yolande
> >
> >
> >
> >
> >
> > -----Original Message-----
> > From: bioconductor-bounces at stat.math.ethz.ch on behalf of Alex Gutteridge
> > Sent: Fri 7/25/2008 4:35 AM
> > To: bioconductor at stat.math.ethz.ch
> > Subject: Re: [BioC] Gostats with Yeast annotation
> > 
> > Hi,
> >
> > Just to confirm the org.Sc.sgd.db package and GOstats seem to work 
> > fine together for me in Bioc-devel (Sample session pasted below).
> >
> > R version 2.8.0 Under development (unstable) (2008-07-22 r46103)
> > Copyright (C) 2008 The R Foundation for Statistical Computing
> > ISBN 3-900051-07-0
> > R is free software and comes with ABSOLUTELY NO WARRANTY.
> > You are welcome to redistribute it under certain conditions.
> > Type 'license()' or 'licence()' for distribution details.
> >   Natural language support but running in an English locale
> > R is a collaborative project with many contributors.
> > Type 'contributors()' for more information and
> > 'citation()' on how to cite R or R packages in publications.
> > Type 'demo()' for some demos, 'help()' for on-line help, or
> > 'help.start()' for an HTML browser interface to help.
> > Type 'q()' to quit R.
> > [Previously saved workspace restored]
> >  > library(Category)
> > Loading required package: Biobase
> > Loading required package: tools
> > Welcome to Bioconductor
> >   Vignettes contain introductory material. To view, type
> >   'openVignette()'. To cite Bioconductor, see
> >   'citation("Biobase")' and for packages 'citation(pkgname)'.
> > Loading required package: graph
> > Loading required package: annotate
> > Loading required package: AnnotationDbi
> > Loading required package: DBI
> > Loading required package: RSQLite
> > Loading required package: xtable
> > Loading required package: genefilter
> > Loading required package: survival
> > Loading required package: splines
> >  > library(GOstats)
> > Loading required package: GO.db
> > Loading required package: RBGL
> >  > sel = readLines("Turbidostat.genes")
> >  > uni = readLines("all.genes")
> >  > params = new("GOHyperGParams", geneIds=sel, universeGeneIds=uni,
> >  +              annotation="org.Sc.sgd.db", ontology="BP", pvalueCutoff=0.1,
> >  +              conditional=FALSE, testDirection="over")
> >  > over = hyperGTest(params)
> >  > summary(over)
> >                GOBPID       Pvalue OddsRatio    ExpCount Count Size
> > GO:0006412 GO:0006412 1.109223e-16  2.755286  62.1352567   125  383
> > GO:0010467 GO:0010467 4.482367e-14  1.824627 225.8284001   317 1392
> > GO:0009059 GO:0009059 8.256804e-14  2.232463  89.0659423   154  549
> > GO:0043170 GO:0043170 8.691113e-13  1.698656 414.6676658   510 2556
> > GO:0044267 GO:0044267 5.484817e-12  1.746362 214.3098539   296 1321
> > GO:0019538 GO:0019538 1.268805e-11  1.718287 224.6927688   306 1385
> > [..snip..]
> >  > q()
> > Save workspace image? [y/n/c]: n
> > ag357 at ag357-pc2102:~/Desktop/study> head Turbidostat.genes
> > YAL001C
> > YAL002W
> > YAL003W
> > YAL005C
> > YAL008W
> > YAL009W
> > YAL010C
> > YAL011W
> > YAL019W
> > ag357 at ag357-pc2102:~/Desktop/study> head all.genes
> > YHR047C
> > YHR051W
> > YHR066W
> > YHR068W
> > YHR075C
> > YHR076W
> > YHR080C
> > YHR083W
> > YHR143W-A
> > YKL137W
> >
> > AlexG
> >
> > On 22 Jul 2008, at 18:07, Robert Gentleman wrote:
> >
> >  
> >> Hi Alex,
> >>  If you are willing to use R-devel and Bioc-devel, the issue should 
> >> be fixed there.  I would be interested in hearing of any problems 
> >> you might have (or successes) using that version.  I am waiting for 
> >> some reports of success before I port this to release,
> >>
> >> best wishes
> >>  Robert
> >>
> >>
> >> Alex Gutteridge wrote:
> >>    
> >>> Hi,
> >>> I've been trying to use the hyperGTest method from the GOstats 
> >>> package with some yeast ORF data. I notice in this thread from a 
> >>> month or so ago that there are problems at the moment with using 
> >>> any of the yeast annotation sets apart from 'YEAST' (which is 
> >>> deprecated) due to missing ID2EntrezID methods:
> >>> https://stat.ethz.ch/pipermail/bioconductor/2008-June/022697.html
> >>> I just wanted to make sure that this was still the case and, I guess, 
> >>> fish around for an ETA for when the org.Sc.sgd.db 
> >>> annotations (which are replacing YEAST, as I understand it) will be 
> >>> compatible with hyperGTest?
> >>> Also, is the exact source of GO annotations used in these packages 
> >>> documented anywhere? Looking in the DESCRIPTION file I see 
> >>> 'primarily based on mapping using ORF identifiers from SGD' for 
> >>> org.Sc.sgd.db and 'assembled using data from public data 
> >>> repositories' for YEAST. Should I just take it these are based on 
> >>> the SGD GO annotation file from the date given in the Packaged 
> >>> field of the DESCRIPTION file? For YEAST there is
> >>>      
> >> the man page is pretty explicit, (?org.Sc.sgdGO)
> >>
> >>     Mappings were based on data provided by: Yeast Genome (
> >>     ftp://genome-ftp.stanford.edu/pub/yeast/data_download ) on
> >>     2008-Mar29
> >>
> >> I am not sure what more we could put there.
> >>
> >> best wishes
> >>  Robert
> >>
> >>    
> >>> also a Created field which is approx. 1 month prior to the Packaged 
> >>> date so I'm guessing the real age of the data is that one? The 
> >>> yeast annotations change so quickly it's useful to be able to pin 
> >>> this down as accurately as possible.
> >>> Thanks in advance for any help with these questions.
> >>> Alex Gutteridge
> >>> Department of Biochemistry
> >>> University of Cambridge
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>      
> >> --
> >> Robert Gentleman, PhD
> >> Program in Computational Biology
> >> Division of Public Health Sciences
> >> Fred Hutchinson Cancer Research Center
> >> 1100 Fairview Ave. N, M2-B876
> >> PO Box 19024
> >> Seattle, Washington 98109-1024
> >> 206-667-7700
> >> rgentlem at fhcrc.org
> >>
> >>    
> >
> > Alex Gutteridge
> >
> > Department of Biochemistry
> > University of Cambridge
> >
> >
> >
> >
> >  
>
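
For readers wondering what the Pvalue, OddsRatio and ExpCount columns in
the quoted hyperGTest summary mean, the sketch below works through the
standard hypergeometric over-representation calculation in base R.  All
four counts are hypothetical and are not taken from the session above,
and GOstats itself may differ in detail (for example in how it trims the
universe to annotated genes), so treat this as an illustration of the
idea rather than a reimplementation.

```r
# Hypothetical counts, chosen only for illustration.
universe_size <- 6000  # annotated genes in the universe
selected_size <- 1000  # selected genes that fall within that universe
term_size     <- 400   # universe genes annotated to the term ("Size")
count         <- 120   # selected genes annotated to the term ("Count")

# Expected number of selected genes in the term under random selection.
exp_count <- term_size * selected_size / universe_size

# Over-representation p-value: P(X >= count) for a hypergeometric X,
# drawing selected_size genes from term_size "in term" and the rest "not".
p_value <- phyper(count - 1, term_size, universe_size - term_size,
                  selected_size, lower.tail = FALSE)

# Sample odds ratio from the 2x2 table (selected or not, in term or not).
n11 <- count                              # selected, in term
n12 <- selected_size - count              # selected, not in term
n21 <- term_size - count                  # not selected, in term
n22 <- universe_size - term_size - n12    # not selected, not in term
odds_ratio <- (n11 * n22) / (n12 * n21)

cat("ExpCount:", exp_count, " OddsRatio:", odds_ratio,
    " Pvalue:", p_value, "\n")
```

With testDirection="over", as in the quoted call, a small p-value means
the term contains more selected genes than random selection from the
universe would be expected to give.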


