[BioC] How to get NCBI's gene annotation?

Marc Carlson mcarlson at fhcrc.org
Tue Mar 17 16:42:18 CET 2009


Hi Wei,

The exact same package also provides the NCBI chromosome assignments.
If you use the CHR mapping like this you will get only NCBI's annotation,
and you can see how it differs from the one provided by UCSC:

mget("21784", org.Mm.egCHR)


You can see where the information behind each mapping comes from by
looking at the man pages:
?org.Mm.egCHR
?org.Mm.egCHRLOC
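
For a large gene list, one possible approach is to keep only the CHRLOC
entries whose chromosome agrees with the CHR mapping. The following is a
rough sketch, not a tested recipe: keep_ncbi_chr is a hypothetical helper
(not part of org.Mm.eg.db), and the input values are copied by hand from
this thread so that the example runs without the annotation package. The
CHR value "17" is an assumption based on what Entrez Gene is reported to
show for Tff1.

```r
# Stand-ins for the results of
#   mget("21784", org.Mm.egCHRLOC)  and  mget("21784", org.Mm.egCHR)
# The CHRLOC values are the ones quoted in this thread; the CHR value is
# assumed from the Entrez Gene report of chromosome 17.
chrloc <- list("21784" = c("17" = -31298340, "5" = -143285576))
chr    <- list("21784" = "17")

# Hypothetical helper: for each gene ID present in both lists, keep only
# the start positions whose chromosome name appears in the CHR mapping.
keep_ncbi_chr <- function(chrloc, chr) {
  ids <- intersect(names(chrloc), names(chr))
  out <- lapply(ids, function(id) {
    loc <- chrloc[[id]]
    loc[names(loc) %in% chr[[id]]]
  })
  names(out) <- ids
  out
}

filtered <- keep_ncbi_chr(chrloc, chr)
print(filtered)
```

With the real package you would pass the lists returned by mget() for
your whole set of Entrez Gene IDs instead of the hand-typed values.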


  Marc




James F. Reid wrote:
> Hi,
>
> the pointer should be for Mouse:
> ftp://ftp.ncbi.nih.gov/genomes/M_musculus/mapview/seq_gene.md.gz
> or here I believe
> ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/GENE_INFO/Mammalia/Mus_musculus.gene_info.gz
>
>
> The reason that the org.Mm.eg.db package is giving you two locations
> is that it uses the alignment given by UCSC of the RefSeq(s) of
> your gene.
> In this particular case NM_009362 aligns with 100% identity on both
> chr5:143285577-143289234 and chr17:31298341-31301998.
> By aligning this sequence by hand using BLAT you can see that the chr5
> hit appeared as of the July 2007 assembly.
> Maybe this kind of information is worth keeping in mind.
>
> Best,
> J.
>
>
> Sean Davis wrote:
>> On Tue, Mar 17, 2009 at 1:58 AM, Wei Shi <shi at wehi.edu.au> wrote:
>>
>>> Dear list,
>>>
>>>  The annotation package "org.Mm.eg.db" provides UCSC's annotation for
>>> mouse genes. However, this annotation can sometimes differ from NCBI's
>>> annotation. Below is an example:
>>>
>>> library(org.Mm.eg.db)
>>> mget("Tff1", org.Mm.egSYMBOL2EG)
>>> $Tff1
>>> [1] "21784"
>>> mget("21784", org.Mm.egCHRLOC)
>>> $`21784`
>>>       17          5
>>> -31298340 -143285576
>>>
>>>   Two chromosomal locations were found for "Tff1", on chromosome 17 and
>>> chromosome 5 respectively. However, this gene is located only on
>>> chromosome 17 according to the NCBI Entrez Gene database. Does anybody
>>> know whether there are any packages or other sources that provide NCBI
>>> gene annotation? I am working on a large set of genes, and NCBI does not
>>> seem to provide downloadable files containing gene information such as
>>> chromosomal locations.
>>>
>>
>> Try here:
>>
>>
>> ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/mapview/
>>
>> Sean
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>


