[BioC] locate a target species in Refseq ftp directory

Darin Takemoto darint at gmail.com
Sat Oct 5 19:06:25 CEST 2013


Hi Heyi,

One way to get the RefSeq RNA sequences for a given species is to look
up the species in the NCBI Taxonomy database and click through to its
details page (for Ovis aries, that is Taxonomy ID 9940). Once there,
find the Entrez records table on the right and click the Direct links
entry for the Nucleotide database. Then use the Advanced link to
restrict the results to RefSeq sequences (choose Filter and select
"nuccore pubmed refseq"). Doing this gives the following results:

http://www.ncbi.nlm.nih.gov/nuccore?term=%28txid9940[Organism%3Anoexp]%29%20AND%20%22nuccore%20pubmed%20refseq%22[Filter]

To download the sequences, click "Send to:", select File, choose FASTA
(or whatever other format you want) as the Format, and click Create File.
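
If you would rather script this than click through the web pages, the
same query can probably be sent to the NCBI E-utilities. What follows is
only a sketch in Python and is not part of the steps above. The search
term is copied from the URL above (decoded), and the helper names
(refseq_term, esearch_url, efetch_url) are made up for illustration.
It only builds the request URLs; it does not contact NCBI, and the
filter name may have changed since this was written.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (assumption: scripted access is
# wanted; the original steps use the web interface only).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def refseq_term(taxid):
    """Build the Entrez search term used in the URL above for a taxonomy ID."""
    return f'(txid{taxid}[Organism:noexp]) AND "nuccore pubmed refseq"[Filter]'


def esearch_url(taxid):
    """URL for an ESearch request that stores the hits on the history server."""
    params = {
        "db": "nuccore",            # the Nucleotide database
        "term": refseq_term(taxid),
        "usehistory": "y",          # keep results server-side for EFetch
        "retmax": 0,                # only the count and history keys are needed
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(web_env, query_key, retstart=0, retmax=500):
    """URL for an EFetch request returning one batch of records as FASTA text.

    web_env and query_key are the WebEnv and QueryKey values found in the
    ESearch response.
    """
    params = {
        "db": "nuccore",
        "query_key": query_key,
        "WebEnv": web_env,
        "rettype": "fasta",
        "retmode": "text",
        "retstart": retstart,
        "retmax": retmax,
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # 9940 is the Taxonomy ID for Ovis aries mentioned above.
    print(esearch_url(9940))
```

You would open the ESearch URL first, read WebEnv and QueryKey from the
response, and then request the EFetch URL in batches by increasing
retstart until all records have been downloaded.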

Darin

On Friday, October 04, 2013 11:29:23 AM, heyi xiao wrote:
> Hi all,
> I am trying to extract the RNA sequences for sheep (Ovis aries) from the RefSeq FTP site. The right directory should be vertebrate_mammalian: ftp://ftp.ncbi.nlm.nih.gov/refseq/release/vertebrate_mammalian/
> But there are so many *rna* files there, all named with numbers, like vertebrate_mammalian.154.rna.fna.gz, and I am not sure which one is for my target species. The README files don't really help with this. Does anyone know how to locate the right file for a target species there?
> Heyi


