[BioC] error while using goProfiles package on arabidopsis entrez gene IDs

Marc Carlson mcarlson at fhcrc.org
Sat Aug 24 01:07:50 CEST 2013


Hi,

So now I can see a little better what you are doing.  The problem lies 
in what happens inside of goProfiles.  This is not my package and I have 
never really used it much myself, so I just did a little debugging to 
see what was happening, and this is what I found:

The basicProfile() function expects you to give it the central ID for 
the org package you name.  It seems to assume that this will be an 
entrez gene ID.  But that is *not* what the arabidopsis community 
usually uses.  That community likes to use TAIR IDs, so org.At.tair.db 
uses TAIR IDs as its central ID (this is why "tair" is in the middle of 
the package name).  You can get and use entrez gene IDs with the 
org.At.tair.db package, but they are not the central ID that many of 
the older methods like mget() expect.  These days we have moved away 
from that model and now use the select() method.  We feel it's less 
confusing, since you no longer need to pay attention to which key type 
is the central one for a given package.  Instead, the select() interface 
just asks you to state the kind of key that you are using (the keytype 
argument), which we feel is more transparent.
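
To make the central ID point concrete, here is a minimal base R sketch 
of why a lookup with the wrong kind of key comes back empty.  It does 
not use goProfiles or org.At.tair.db at all: goMap is a hypothetical 
stand-in for a GO mapping keyed by TAIR IDs, and every ID and term in 
it is made up for illustration.

```r
## Toy stand-in for a GO mapping keyed by the central ID (TAIR).
## All IDs and terms below are invented, not real annotations.
goMap <- list2env(list(
    "TAIR_A" = c("GO:term1", "GO:term2"),
    "TAIR_B" = "GO:term3"
))

## Helper: TRUE for each key whose lookup returned only NA
isMissing <- function(hits) {
    vapply(hits, function(x) all(is.na(x)), logical(1))
}

## Keys of the wrong type (entrez-style numbers) match nothing
wrongKeys <- c("1001", "1002")
missed <- mget(wrongKeys, envir = goMap, ifnotfound = NA)

## Keys of the central type find their GO terms
rightKeys <- c("TAIR_A", "TAIR_B")
found <- mget(rightKeys, envir = goMap, ifnotfound = NA)

nMissed <- sum(isMissing(missed))
nFound  <- sum(!isMissing(found))
```

With the wrong key type every lookup is NA, so any code that later 
indexes into the results has nothing to work with.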

So anyways here is how I was able to make it run:

## load the annotation package and goProfiles
library(org.At.tair.db)
library(goProfiles)

## 1st take some of your entrez gene IDs
egIDs <- c("839235", "838362", "838961", "837091", "837455", "837543")

## use select to quickly translate these into TAIR IDs, and then grab
## that column of IDs back out.
## (You may find it more convenient to just start with the TAIR IDs that
## you said were in your file, but I don't have those here)
## NOTE: current AnnotationDbi releases name this argument 'columns';
## older releases named it 'cols'.
tairIDs <- as.character(select(org.At.tair.db, keys=egIDs, columns="TAIR",
                               keytype="ENTREZID")[[2]])

## THEN call basicProfile function and pass in tair IDs instead...
## Now when it calls mget on the GO mapping, it will actually get some
## matches.
basicProfile(tairIDs, idType="Entrez", onto="ANY", level=2,
             orgPackage="org.At.tair.db", ord=FALSE)
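
One thing to watch for with a larger list: select() can return more 
than one row for a key and NA where a key has no match, so it may be 
worth tidying the TAIR column before handing it to basicProfile().  
Here is a minimal sketch of that cleanup.  The data frame res is a 
hypothetical stand-in shaped like a select() result, and its rows are 
made up.

```r
## Toy data frame shaped like the output of select(); rows are invented.
res <- data.frame(
    ENTREZID = c("1001", "1002", "1002", "1003"),
    TAIR     = c("TAIR_A", "TAIR_B", "TAIR_C", NA),
    stringsAsFactors = FALSE
)

## Pick the column by name rather than by position, drop unmapped
## entries, and remove duplicates
tairIDs <- unique(res$TAIR[!is.na(res$TAIR)])

## Keep track of the entrez IDs that found no TAIR ID
unmapped <- unique(res$ENTREZID[is.na(res$TAIR)])
```

Checking unmapped tells you how many of your IDs dropped out before the 
profile is computed.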


I hope this helps you,


   Marc



On 08/22/2013 02:07 AM, dd [guest] wrote:
> Hi all,
> I was using goProfiles package for functional analysis using a genelist of 316 Arabidopsis entrez gene IDs as shown below in the R command sessionInfo().
>
> - Read a file containing Entrez IDs and TAIR IDs.
> - Subset the Entrez IDs and converted to character vector.
> - Used the vector as genelist.
> - Used goProfiles package function basicProfile for this genelist with organism package of Arabidopsis.
>
> OUTPUT :Error in GOtermslist[[i]] : subscript out of bounds.
>
> Can somebody please help me in finding any mistake I might have done?
>
> Thanks in advance.
>
>   -- output of sessionInfo():
>
> Console output :
>
>>> a<-read.table("tair_ids to gene_ids.csv" ,header=TRUE,sep=",")
>>    
>>> b<-as.character(a[,2])
>>> head(b)
>> [1] "839235" "838362" "838961" "837091" "837455" "837543"
>>
>>> h<-basicProfile(b,idType="Entrez",onto ="ANY",level=2,orgPackage="org.At.tair.db",ord=FALSE)
>> Error in GOtermslist[[i]] : subscript out of bounds


