[BioC] GOLOCUSID and GOALLLOCUSID disagree with AmiGO

Robert Gentleman rgentlem at fhcrc.org
Sat Jan 29 00:56:01 CET 2005


As another data point, why not just look the term up directly, rather
than going through the rather roundabout set of operations that you used?
 > GOALLLOCUSID$"GO:0000158"
    IDA    IDA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA
  24673   5520 116663  19053  24666  24668  24669  24672  24673  24674  24675
    IEA    IEA    IMP    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS
  25594  65179  45959 117281  19045  19046  19052  19053  19055  28227 319520
    ISS    ISS    ISS    ISS    NAS     NR     NR    TAS    TAS
  45959  47877  63953  67857  45959   5518   5519  19052  24672
 > gg <- GOALLLOCUSID$"GO:0000158"

Note that some of the LocusLink IDs are duplicated. Why, you might ask?
Because they are annotated to this term under more than one evidence
code (45959, for example, appears under IMP, ISS and NAS).

 > sum(duplicated(gg))
[1] 6
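
To see which IDs repeat and under which evidence codes, something along
these lines should work. It is only a minimal sketch: gg is rebuilt by hand
from the output printed above so that it runs without the GO package, and
the names dup_ids and by_id are purely illustrative.

```r
# Rebuild gg by hand from the printed output (an assumption, so that the
# example does not need the GO package installed).
gg <- scan(text = "24673 5520 116663 19053 24666 24668 24669 24672 24673
                   24674 24675 25594 65179 45959 117281 19045 19046 19052
                   19053 19055 28227 319520 45959 47877 63953 67857 45959
                   5518 5519 19052 24672",
           what = "", quiet = TRUE)
names(gg) <- c(rep("IDA", 2), rep("IEA", 11), "IMP", rep("ISS", 12),
               "NAS", rep("NR", 2), rep("TAS", 2))

# Number of repeated entries, as in the check above.
n_dup <- sum(duplicated(gg))

# The distinct IDs that occur more than once.
dup_ids <- unique(gg[duplicated(gg)])

# Evidence codes grouped by LocusLink ID, restricted to the repeated IDs.
by_id <- split(names(gg), gg)
print(by_id[dup_ids])
```

Each element of by_id[dup_ids] lists the evidence codes attached to one
repeated LocusLink ID.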

A quick check suggests that all of the ones you have listed are there,
as are some others (and you can verify whether they are right at NCBI
if you want).

For example, we have 5518 and AmiGO doesn't; my reading of LocusLink is
that it agrees with us. It is reasonably simple to verify most of
this, if that is what you want to do.
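
One way to make that comparison explicit is sketched below. It assumes the
eight AmiGO IDs are exactly those quoted in your message, and it again
rebuilds gg by hand from the printed output; amigo, ours and extra are
illustrative names only.

```r
# LocusLink IDs from the printed output (rebuilt by hand; an assumption so
# that the example runs without the GO package).
gg <- scan(text = "24673 5520 116663 19053 24666 24668 24669 24672 24673
                   24674 24675 25594 65179 45959 117281 19045 19046 19052
                   19053 19055 28227 319520 45959 47877 63953 67857 45959
                   5518 5519 19052 24672",
           what = "", quiet = TRUE)

# IDs that AmiGO reported for GO:0000158, as quoted in the original message.
amigo <- c("19045", "19046", "19052", "19053", "19055",
           "63953", "319520", "67857")

# Drop the repeats that come from multiple evidence codes.
ours <- unique(gg)

# Are all of the AmiGO IDs present in our list?
print(all(amigo %in% ours))

# IDs that we have and AmiGO does not list; these are the ones to check
# by hand at NCBI.
extra <- setdiff(ours, amigo)
print(extra)
```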

  Robert

On Jan 28, 2005, at 1:33 PM, Dick Beyer wrote:

> I am having some trouble understanding the correct usage of GOLOCUSID  
> and GOALLLOCUSID.  I can't get the list of LocusLink identifiers  
> output for a particular GOID to agree with AmiGO.  Also, for this  
> particular GOID, GO:0000158, the return from GOLOCUSID and  
> GOALLLOCUSID are the same, which seems wrong.  I am using the latest  
> development version of GO.
>
> Then again, perhaps I am not approaching this correctly as I have not  
> used these functions before.
>
> AmiGO shows 8 genes for GO:0000158, and both GOLOCUSID and  
> GOALLLOCUSID show 33.
>
> Would someone please look at the following code example and tell me  
> what I am doing wrong?
>
>> require("GO") || stop("GO unavailable")
>> myGOALLLOCUSID  <- as.list(GOALLLOCUSID)
>> allGOALLLOCUSID <- names(myGOALLLOCUSID)
>> allGOALLLOCUSID <- sub("GO:","",allGOALLLOCUSID)
>> myGOLOCUSID     <- as.list(GOLOCUSID)
>> allGOLOCUSID    <- names(myGOLOCUSID)
>> allGOLOCUSID    <- sub("GO:","",allGOLOCUSID)
>> which(allGOLOCUSID == "0000158")
> [1] 3370
>> myGOLOCUSID[3370]
> $"GO:0000158"
>    IDA    IDA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA
>  24673   5520 116663  19053  24666  24668  24669  24672  24673  24674  24675  25594  65179
>    IMP    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS
>  45959 117281  19045  19046  19052  19053  19055  28227 319520  39337  45959  47877  63953
>    ISS    NAS     NR     NR    TAS    TAS    TAS
>  67857  45959   5518   5519  19052  24672   5516
>
>> which(allGOALLLOCUSID == "0000158")
> [1] 2856
>> myGOALLLOCUSID[2856]
> $"GO:0000158"
>    IDA    IDA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA
>  24673   5520 116663  19053  24666  24668  24669  24672  24673  24674  24675  25594  65179
>    IMP    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS
>  45959 117281  19045  19046  19052  19053  19055  28227 319520  39337  45959  47877  63953
>    ISS    NAS     NR     NR    TAS    TAS    TAS
>  67857  45959   5518   5519  19052  24672   5516
>
>
>
> AmiGO tells me GO:0000158 has the genes:
> 19045	Ppp1ca
> 19046	Ppp1cb
> 19052	Ppp2ca
> 19053	Ppp2cb
> 19055	Ppp3ca
> 63953	Dusp10
> 319520	Dusp4
> 67857	Ppp6c
>
> base 2.0.1, datasets 2.0.1, utils 2.0.1, grDevices 2.0.1, graphics 2.0.1,
> stats 2.0.1, methods 2.0.1, tools 2.0.1, Biobase 1.5.0, reposTools 1.5.1,
> affy 1.5.8, matchprobes 1.0.12, gcrma 1.1.1, qvalue 1.1, siggenes 1.2.11,
> limma 1.8.6, GO 1.6.8, xtable 1.2-4
>
> Thanks very much for any help or suggestions,
> Dick
> *******************************************************************************
> Richard P. Beyer, Ph.D.	University of Washington
> Tel.:(206) 616 7378	Env. & Occ. Health Sci. , Box 354695
> Fax: (206) 685 4696	4225 Roosevelt Way NE, # 100
> 			Seattle, WA 98105-6099
> http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
+-----------------------------------------------------------------+
| Robert Gentleman                         phone:  (206) 667-7700 |
| Head, Program in Computational Biology   fax:    (206) 667-1319 |
| Division of Public Health Sciences       office: M2-B865        |
| Fred Hutchinson Cancer Research Center                          |
| email: rgentlem at fhcrc.org                                    |
+-----------------------------------------------------------------+


