[BioC] org.*.eg.db problem

Marc Carlson mcarlson at fhcrc.org
Thu Dec 2 18:37:22 CET 2010


Hi Arne,

Martin's proposed solution will allow you to generalize the code and is
probably what you want to be doing. 

But you also have to be mindful that, independent of design decisions
made by the Bioconductor team, the universe we live in simply does not
always provide annotations for each species equally.  So for example,
OMIM annotations only exist for humans (by definition), and there are
no FlyBase IDs for mouse.  You might reasonably expect that
you should always be able to find an annotation such as 'CHRLOC' for
any given species.  But even 'CHRLOC' may not exist for some species,
simply because the source that generates that annotation might not have
done so for the species you are interested in.  All of these things are
well outside of Bioconductor's control, and so we simply cannot guarantee
that all of these mappings will be available in every situation. 

So these less generic mapping names are actually telling you something
that you need to know.  They are telling you that a kind of information
exists for a particular species.

When you write your function, you have to also consider whether or not
the mapping can even exist for the species in question.  getAnnMap()
will tell you this by throwing an error when the mapping in question
does not exist.
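
As a rough sketch of what I mean (safeAnnMap and its getter argument are
names made up for this example, not part of annotate), you could wrap
the call so that a missing mapping gives you NULL instead of stopping
your function.  This assumes annotate and the org package in question
are installed:

```r
# Hypothetical helper, not part of annotate: return the requested map,
# or NULL (with a message) if the lookup signals an error.
# 'getter' defaults to annotate::getAnnMap and is only a parameter so
# that the lookup function can be swapped out.
safeAnnMap <- function(map, pkg, getter = annotate::getAnnMap) {
    tryCatch(
        getter(map, pkg),
        error = function(e) {
            message("No '", map, "' map available from ", pkg, ": ",
                    conditionMessage(e))
            NULL
        }
    )
}

# Usage, guarded so it only runs where both packages are installed.
if (requireNamespace("annotate", quietly = TRUE) &&
    requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
    chrloc <- safeAnnMap("CHRLOC", "org.Mm.eg.db")
    # OMIM is human-only, so this lookup is expected to come back NULL.
    omim <- safeAnnMap("OMIM", "org.Mm.eg.db")
    if (is.null(omim)) message("Skipping OMIM for mouse")
}
```

Your function can then test is.null() on the result and skip, or
substitute something else, for the species that lack that mapping.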


Hope this helps,


  Marc



On 12/01/2010 08:50 AM, Martin Morgan wrote:
> On 12/01/2010 08:48 AM, arne.mueller at novartis.com wrote:
>   
>> Hello,
>>
>> the org.*.eg.db environments cannot be used in a generic way :-( . Let's 
>> say I'm writing a function that needs an entire org.*.eg.db environment as 
>> argument, and the function doesn't care whether it's human, mouse, rat or 
>> jellyfish. Inside my function I'd need to access the maps (e.g. for 
>> chromosomal location) without knowing the species. The problem is that you 
>> do need to know the species because the mapping names use the species 
>> abbreviation:
>>
>>     
>>> org.Mm.egCHRLOC
>>>       
>> CHRLOC map for Mouse (object of class "AnnDbMap")
>>
>> Why isn't this more generic, so that one could just call egCHRLOC instead 
>> of org.Mm.egCHRLOC? As it is, code that uses this annotation has to 
>> know about the organism - why does it have to be hard coded? Ideally 
>> I'd like to be able to do the following:
>>
>>     
>>> library(org.Mm.eg.db) 
>>> myGenomeAnnotationFunction(org.Mm.eg.db)  { # pass in as an environment?
>>>       
>>       # use the annotation environment to extract whatever information ...
>> }
>>
>> How would you solve this when having to work with several species (if else 
>> ... ???)
>>
>>     
>
> Hi arne --
>
> For many cases,
>
> library(annotate)
> map <- getAnnMap("CHRLOC", "org.Mm.eg.db")
>
> which will take care of loading the org package as well.
>
> Martin
>
>
>   
>>   thanks a lot for your help,
>>
>>   Arne
>
>


