[BioC] locate a target species in Refseq ftp directory

heyi xiao xiaoheyiyh at yahoo.com
Wed Oct 9 17:44:47 CEST 2013

Thanks Darin, that’s very useful too!

On Sat Oct 5, Darin Takemoto wrote:
Hi Heyi,

One way to get the RefSeq RNA sequences for a given species is to look 
up the species in the NCBI Taxonomy database and click through to the 
details page for the species of interest (for Ovis aries, that is 
Taxonomy ID 9940). Once there, find the Entrez records table on the 
right and click the Direct links entry for the Nucleotide database. 
Then use the Advanced link to restrict the results to RefSeq sequences 
(choose Filter and select "nuccore pubmed refseq"). When you do this, 
here are the results:


To get the sequences, click on "Send to:", select File, choose FASTA 
(or whatever else you want) as the Format, and click Create File.
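
If you would rather script this than click through the web pages, the same search can probably be run through NCBI's E-utilities. What follows is only a rough sketch under some assumptions, not something taken from the steps above: the helper names (refseq_term, esearch_url, efetch_url, download_fasta) are made up for illustration, the query uses the standard Entrez terms txid9940[Organism:exp] and refseq[filter] in place of the web Filter menu, and it does not restrict the hits to RNA, so an extra term would be needed for that.

```python
import json
import urllib.parse
import urllib.request

# Base address of the NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def refseq_term(taxid):
    """Entrez query for all RefSeq nucleotide records under one taxonomy ID."""
    return f"txid{taxid}[Organism:exp] AND refseq[filter]"


def esearch_url(taxid):
    """URL that runs the search and keeps the hits on NCBI's history server."""
    params = {
        "db": "nuccore",
        "term": refseq_term(taxid),
        "usehistory": "y",
        "retmode": "json",
    }
    return f"{EUTILS}/esearch.fcgi?" + urllib.parse.urlencode(params)


def efetch_url(webenv, query_key, retstart=0, retmax=500):
    """URL that fetches one batch of the stored hits as FASTA text."""
    params = {
        "db": "nuccore",
        "WebEnv": webenv,
        "query_key": query_key,
        "rettype": "fasta",
        "retmode": "text",
        "retstart": retstart,
        "retmax": retmax,
    }
    return f"{EUTILS}/efetch.fcgi?" + urllib.parse.urlencode(params)


def download_fasta(taxid, out_path, batch=500):
    """Search once, then append the FASTA records to out_path in batches.

    Needs network access; returns the number of records the search reported.
    """
    with urllib.request.urlopen(esearch_url(taxid)) as resp:
        result = json.load(resp)["esearchresult"]
    total = int(result["count"])
    with open(out_path, "wb") as out:
        for start in range(0, total, batch):
            url = efetch_url(result["webenv"], result["querykey"], start, batch)
            with urllib.request.urlopen(url) as resp:
                out.write(resp.read())
    return total
```

Calling download_fasta(9940, "ovis_aries_refseq.fasta") would then write the records to a local file (the file name is just an example). For a large result set it is worth pausing between batches so you stay within NCBI's request limits.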


 On Friday, October 04, 2013 11:29:23 AM, heyi xiao wrote:
 > Hi all,
 > I am trying to extract the RNA sequences for sheep (Ovis aries) from
 > the RefSeq FTP site. The right directory should be vertebrate_mammalian:
 > ftp://ftp.ncbi.nlm.nih.gov/refseq/release/vertebrate_mammalian/
 > But there are so many *rna* files there, all named with numbers, like
 > vertebrate_mammalian.154.rna.fna.gz, that I am not sure which one is
 > for my target species. The README files don’t really help with this.
 > Does anyone know how to locate the right file for a target species
 > there?
 > Heyi
 > _______________________________________________
 > Bioconductor mailing list
 > Bioconductor at r-project.org
 > https://stat.ethz.ch/mailman/listinfo/bioconductor
 > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

 James W. MacDonald, M.S.
 University of Washington
 Environmental and Occupational Health Sciences
 4225 Roosevelt Way NE, # 100
 Seattle WA 98105-6099
