[BioC] Fw: Warning of function "ncbiTaxonomy"

Chris Stubben stubben at lanl.gov
Mon Mar 5 17:21:48 CET 2012


I sent this on Friday but I'm not sure what happened to it.   I 
apologize if this re-posts.

 > I have a list of NCBI taxon ids for which I would like to have both
 > the full lineage and common name information. So I installed the package
 > called 'genomes' (genomes_2.0.0.zip), then used the function 'ncbiTaxonomy'
 > as follows,

 > ncbiTaxonomy (1000587, "lineage")
 >Premature end of data in tag TaxaSet line 1


The new NCBI E-Utilities updates (version 2.0 of ESummary and EFetch) 
have broken a number of functions in my genomes package, including 
ncbiTaxonomy, so I decided to simplify and rewrite all the NCBI 
E-utility code and separate it from the parsers. You can find a 
complete description on GitHub at https://github.com/cstubben/ncbi, and 
I will update the genomes dev package in a few weeks once I get 
everything worked out. It should work something like this after the 
next update...

Run einfo to see a list of search fields:

einfo("taxonomy")
   Name        FullName
1   ALL      All Fields
2  ALLN       All Names
3  COMN     Common Name
4  EDAT     Entrez Date
5  FILT          Filter
6    GC              GC
7  LNGE         Lineage
8   MGC             MGC
9  NXLV      Next Level
...

Then run esearch (using the lineage field) with esummary to get all 
taxa in the lineage (I think the results are usually sorted 
phylogenetically):

esummary( esearch("Huitzilac virus[LNGE]", "taxonomy"))
       Id         Rank Division                ScientificName
1 1000587      species  viruses               Huitzilac virus
2  339351               viruses       unclassified Hantavirus
3   11598        genus  viruses                    Hantavirus
4   11571       family  viruses                  Bunyaviridae
5   35301               viruses ssRNA negative-strand viruses
6  439488               viruses                 ssRNA viruses
7   10239 superkingdom  viruses                       Viruses
8       1                                                root
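
For anyone curious what sits underneath, these calls are plain HTTP 
requests to the NCBI E-utilities. The following is only a rough sketch 
and not the code in the ncbi repository: eutils_url is a hypothetical 
helper, and it assumes nothing beyond the public base URL and the 
documented db, term, id and retmode parameters. It builds the request 
URLs without sending them.

```r
# Hypothetical helper, not part of the genomes or ncbi packages:
# builds an NCBI E-utilities request URL from named query parameters.
eutils_url <- function(tool, ...) {
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
  params <- list(...)
  # Percent-encode each value so spaces, commas and brackets are URL-safe
  values <- vapply(
    params,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  query <- paste(names(params), values, sep = "=", collapse = "&")
  paste0(base, tool, ".fcgi?", query)
}

# Search the taxonomy database on the lineage field
search_url <- eutils_url("esearch", db = "taxonomy",
                         term = "Huitzilac virus[LNGE]")

# Fetch full XML records for a comma-separated list of taxon ids
fetch_url <- eutils_url("efetch", db = "taxonomy",
                        id = "1000587,86782", retmode = "xml")

print(search_url)
print(fetch_url)
```

The resulting strings could be passed to readLines() or download.file() 
to retrieve the raw results.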

In addition, any search result or list of IDs can be passed directly 
to esummary, efetch or elink. I was parsing the Lineage tag from the 
XML results returned by EFetch.

efetch("1000587,86782", db="taxonomy", retmode="xml")

<Lineage>Viruses; ssRNA viruses; ssRNA negative-strand viruses; 
Bunyaviridae; Hantavirus; unclassified Hantavirus</Lineage>
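
As an illustration of the parsing step, here is a minimal sketch in 
base R that pulls the lineage out of XML like the fragment above. 
parse_lineage is a hypothetical name, not a function in genomes, and 
the regular expression is a simplification that assumes at least one 
simple <Lineage> element; a proper XML parser would be a safer choice 
for full records.

```r
# A trimmed stand-in for the XML that efetch returns
# (only the Lineage element shown in this message)
xml <- paste0(
  "<Lineage>Viruses; ssRNA viruses; ssRNA negative-strand viruses; ",
  "Bunyaviridae; Hantavirus; unclassified Hantavirus</Lineage>"
)

# Hypothetical helper: extract the text of each <Lineage> element
# and split it into individual taxon names on the ";" separator.
parse_lineage <- function(x) {
  tags <- regmatches(x, gregexpr("<Lineage>[^<]*</Lineage>", x))[[1]]
  text <- gsub("</?Lineage>", "", tags)
  strsplit(text, ";\\s*")
}

# One character vector per <Lineage> element; take the first
lineage <- parse_lineage(xml)[[1]]
print(lineage)
```

The result is a character vector running from the broadest group 
(Viruses) to the narrowest (unclassified Hantavirus).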

So, long story short, the updated ncbiTaxonomy(1000587, "lineage") will 
be able to get these results again in a couple of weeks. Sorry about 
the delay.


Chris Stubben
 

-- 

Los Alamos National Lab
BioScience Division


