[BioC] GOLOCUSID and GOALLLOCUSID disagree with AmiGO

Dick Beyer dbeyer at u.washington.edu
Sat Jan 29 02:12:14 CET 2005


Hi Robert,

Thanks for showing me how to get the LL the easy way.

When I submit the LL list to S.O.U.R.C.E., I see that the IDs come from all species (judging by the UGCluster), whereas I want only mouse.

Is there an easy way to filter by species?  If not, would it be possible to build something like GOLOCUSIDMUSMUSCULUS?

My goal is to feed a set of LLs to GOstats, get a ranked list of GOIDs, pick the most significant ones, generate a list of LLs for each of those GOIDs (but only for a particular species), and then go back to the microarray results to see what that last list of LLs is doing in my experiment.
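
The species-filtering step in that workflow could be sketched in base R roughly as follows. This is only an illustration and not something the GO package provides: `gg` is a small hand-copied subset of the GOALLLOCUSID output quoted below, and `mouseLL` is a hypothetical vector of mouse LocusLink IDs (here simply the eight IDs that AmiGO lists for GO:0000158, as quoted further down). In practice `mouseLL` would have to come from some species-specific source, for example the LocusLink IDs represented on the mouse array.

```r
# Hand-copied subset of GOALLLOCUSID$"GO:0000158":
# LocusLink IDs, named by evidence code
gg <- c(IDA = 24673, IDA = 5520, ISS = 19045, ISS = 19046,
        ISS = 19052, TAS = 19052, NR = 5518)

# Hypothetical species filter: mouse LocusLink IDs (an assumption for illustration)
mouseLL <- c(19045, 19046, 19052, 19053, 19055, 63953, 319520, 67857)

# Keep only IDs found in the mouse set, then drop the repeats that arise
# when one gene is annotated with several evidence codes
mouseOnly <- unique(unname(gg[gg %in% mouseLL]))
print(mouseOnly)
```

The same `%in%` filter could then be applied to the LL list of each top-ranked GOID before going back to the array results.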

Please let me know if you think there is a better way.

Thanks very much,
Dick
*******************************************************************************
Richard P. Beyer, Ph.D.	University of Washington
Tel.:(206) 616 7378	Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696	4225 Roosevelt Way NE, # 100
 			Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
*******************************************************************************

On Fri, 28 Jan 2005, Robert Gentleman wrote:

> As another data point,
> why not just do the following (rather than the rather peculiar set of
> operations that you did)?
>> GOALLLOCUSID$"GO:0000158"
>   IDA    IDA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA
> 24673   5520 116663  19053  24666  24668  24669  24672  24673  24674  24675
>   IEA    IEA    IMP    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS
> 25594  65179  45959 117281  19045  19046  19052  19053  19055  28227 319520
>   ISS    ISS    ISS    ISS    NAS     NR     NR    TAS    TAS
> 45959  47877  63953  67857  45959   5518   5519  19052  24672
>> gg=GOALLLOCUSID$"GO:0000158"
>
> Note that some of the LocusLink IDs are duplicated. Why, you might ask? Because
> they are annotated there for more than one reason (each occurrence carries its
> own evidence code).
>
>> sum(duplicated(gg))
> [1] 6
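
To see which IDs are repeated and under which evidence codes, a base R sketch along these lines may help. The vector `gg` is re-typed by hand from the output above so that the snippet runs without the GO package; `byID` and `uniqueLL` are names made up for this example.

```r
# LocusLink IDs for GO:0000158, copied from the output above
gg <- c(24673, 5520, 116663, 19053, 24666, 24668, 24669, 24672, 24673, 24674, 24675,
        25594, 65179, 45959, 117281, 19045, 19046, 19052, 19053, 19055, 28227, 319520,
        45959, 47877, 63953, 67857, 45959, 5518, 5519, 19052, 24672)
# Evidence codes, in the same order as the IDs
names(gg) <- c(rep("IDA", 2), rep("IEA", 11), "IMP", rep("ISS", 12),
               "NAS", rep("NR", 2), rep("TAS", 2))

# Number of entries that repeat an earlier LocusLink ID
sum(duplicated(gg))                 # 6

# Group the evidence codes by LocusLink ID
byID <- split(names(gg), gg)
byID[["45959"]]                     # "IMP" "ISS" "NAS"

# IDs annotated with more than one evidence code
names(byID)[sapply(byID, length) > 1]

# One entry per gene, evidence codes dropped
uniqueLL <- unique(unname(gg))
length(uniqueLL)                    # 25
```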
>
> A quick check suggests that all of the ones you have listed are there, as are 
> some others (and you can verify whether they are right at NCBI if you want).
>
> For example we have 5518 and AmiGO doesn't; my read of LocusLink says that they 
> agree with us. It is reasonably simple to verify most of this, if that is what 
> you want to do.
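
A quick check of this kind can be done with `%in%` and `setdiff()`. The sketch below assumes nothing beyond base R: `pkgLL` is typed in by hand from the GOALLLOCUSID output above, `amigoLL` from the AmiGO list in the original question, and both variable names are invented for the example.

```r
# LocusLink IDs reported by GOALLLOCUSID for GO:0000158 (duplicates removed)
pkgLL <- unique(c(24673, 5520, 116663, 19053, 24666, 24668, 24669, 24672, 24673,
                  24674, 24675, 25594, 65179, 45959, 117281, 19045, 19046, 19052,
                  19053, 19055, 28227, 319520, 45959, 47877, 63953, 67857, 45959,
                  5518, 5519, 19052, 24672))

# LocusLink IDs that AmiGO lists for the same term
amigoLL <- c(19045, 19046, 19052, 19053, 19055, 63953, 319520, 67857)

# Are all of the AmiGO IDs present in the package's list?
all(amigoLL %in% pkgLL)            # TRUE

# IDs the package has that AmiGO does not show
extra <- setdiff(pkgLL, amigoLL)
length(extra)                      # 17
5518 %in% extra                    # TRUE
```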
>
> Robert
>
> On Jan 28, 2005, at 1:33 PM, Dick Beyer wrote:
>
>> I am having some trouble understanding the correct usage of GOLOCUSID and 
>> GOALLLOCUSID.  I can't get the list of LocusLink identifiers output for a 
>> particular GOID to agree with AmiGO.  Also, for this particular GOID, 
>> GO:0000158, the return from GOLOCUSID and GOALLLOCUSID are the same, which 
>> seems wrong.  I am using the latest development version of GO.
>> 
>> Then again, perhaps I am not approaching this correctly as I have not used 
>> these functions before.
>> 
>> AmiGO shows 8 genes for GO:0000158, and both GOLOCUSID and GOALLLOCUSID show 
>> 33.
>> 
>> Would someone please look at the following code example and tell me what I 
>> am doing wrong?
>> 
>>> require("GO") || stop("GO unavailable")
>>> myGOALLLOCUSID  <- as.list(GOALLLOCUSID)
>>> allGOALLLOCUSID <- names(myGOALLLOCUSID)
>>> allGOALLLOCUSID <- sub("GO:","",allGOALLLOCUSID)
>>> myGOLOCUSID     <- as.list(GOLOCUSID)
>>> allGOLOCUSID    <- names(myGOLOCUSID)
>>> allGOLOCUSID    <- sub("GO:","",allGOLOCUSID)
>>> which(allGOLOCUSID == "0000158")
>> [1] 3370
>>> myGOLOCUSID[3370]
>> $"GO:0000158"
>>    IDA    IDA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA
>>  24673   5520 116663  19053  24666  24668  24669  24672  24673  24674  24675  25594  65179
>>    IMP    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS
>>  45959 117281  19045  19046  19052  19053  19055  28227 319520  39337  45959  47877  63953
>>    ISS    NAS     NR     NR    TAS    TAS    TAS
>>  67857  45959   5518   5519  19052  24672   5516
>> 
>>> which(allGOALLLOCUSID == "0000158")
>> [1] 2856
>>> myGOALLLOCUSID[2856]
>> $"GO:0000158"
>>    IDA    IDA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA    IEA
>>  24673   5520 116663  19053  24666  24668  24669  24672  24673  24674  24675  25594  65179
>>    IMP    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS    ISS
>>  45959 117281  19045  19046  19052  19053  19055  28227 319520  39337  45959  47877  63953
>>    ISS    NAS     NR     NR    TAS    TAS    TAS
>>  67857  45959   5518   5519  19052  24672   5516
>> 
>> 
>> 
>> AmiGO tells me GO:0000158 has the genes:
>> 19045	Ppp1ca
>> 19046	Ppp1cb
>> 19052	Ppp2ca
>> 19053	Ppp2cb
>> 19055	Ppp3ca
>> 63953	Dusp10
>> 319520	Dusp4
>> 67857	Ppp6c
>> 
>> base 2.0.1, datasets 2.0.1, utils 2.0.1, grDevices 2.0.1, graphics 2.0.1,
>> stats 2.0.1, methods 2.0.1, tools 2.0.1, Biobase 1.5.0, reposTools 1.5.1,
>> affy 1.5.8, matchprobes 1.0.12, gcrma 1.1.1, qvalue 1.1, siggenes 1.2.11,
>> limma 1.8.6, GO 1.6.8, xtable 1.2-4
>> 
>> Thanks very much for any help or suggestions,
>> Dick
> +---------------------------------------------------------------+
> Robert Gentleman
> Head, Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> phone: (206) 667-7700   fax: (206) 667-1319   office: M2-B865
> email: rgentlem at fhcrc.org
> +---------------------------------------------------------------+
>
>


