[BioC] locate a target species in Refseq ftp directory
heyi xiao
xiaoheyiyh at yahoo.com
Wed Oct 9 17:44:47 CEST 2013
Thanks Darin, That’s very useful too!
--------------------------------------------
On Sat Oct 5, Darin Takemoto wrote:
Hi Heyi,
One way to get the Refseq RNA sequences for a given species is to look
up the species in the NCBI Taxonomy database and click through to the
details page for the species of interest (for Ovis aries, that is
Taxonomy ID 9940). Once there, find the Entrez records table on the
right and click the Direct links entry for the Nucleotide database.
Then use the Advanced link to restrict the results to Refseq sequences
(choose Filter and select "nuccore pubmed refseq"). Here are the
results:
http://www.ncbi.nlm.nih.gov/nuccore?term=%28txid9940[Organism%3Anoexp]%29%20AND%20%22nuccore%20pubmed%20refseq%22[Filter]
To get the sequences, click "Send to:", select File, choose FASTA (or
whatever else you want) as the Format, and click Create File.
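
For anyone who would rather script this than click through the web
pages, here is a minimal Python sketch of the same search using the NCBI
E-utilities (esearch and efetch). It is an illustration under stated
assumptions, not a tested pipeline: the helper names (refseq_term,
esearch_url, efetch_url, download_fasta) and the batch size of 500 are
made up for this example, and the query string is simply the one from
the URL above.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base address of the NCBI E-utilities web service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def refseq_term(taxid):
    """Rebuild the query from the web search: one taxon, Refseq filter only."""
    return f'(txid{taxid}[Organism:noexp]) AND "nuccore pubmed refseq"[Filter]'


def esearch_url(taxid):
    """URL that runs the search and keeps the result set on the history server."""
    params = {"db": "nuccore", "term": refseq_term(taxid),
              "usehistory": "y", "retmax": 0}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(webenv, query_key, retstart=0, retmax=500):
    """URL that fetches one batch of the stored result set as FASTA text."""
    params = {"db": "nuccore", "WebEnv": webenv, "query_key": query_key,
              "rettype": "fasta", "retmode": "text",
              "retstart": retstart, "retmax": retmax}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def download_fasta(taxid, path, batch=500):
    """Hypothetical helper: write all matching records to `path` in batches."""
    with urllib.request.urlopen(esearch_url(taxid)) as resp:
        root = ET.parse(resp).getroot()
    count = int(root.findtext("Count"))
    webenv = root.findtext("WebEnv")
    query_key = root.findtext("QueryKey")
    with open(path, "wb") as out:
        for start in range(0, count, batch):
            url = efetch_url(webenv, query_key, start, batch)
            with urllib.request.urlopen(url) as resp:
                out.write(resp.read())
    return count

# Example call (needs network access; 9940 is the Ovis aries Taxonomy ID):
# download_fasta(9940, "ovis_aries_refseq.fasta")
```

The search is run once with usehistory=y so that the result set stays on
the NCBI side and the records can be pulled in batches rather than in
one very large request. For large result sets it is worth checking the
NCBI usage guidelines on request rates before running something like
this.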
Darin
On Friday, October 04, 2013 11:29:23 AM, heyi xiao wrote:
> Hi all,
> I am trying to extract the RNA sequences for sheep (Ovis aries) from
> the Refseq ftp site. The right directory should be vertebrate_mammalian:
> ftp://ftp.ncbi.nlm.nih.gov/refseq/release/vertebrate_mammalian/
> But there are so many *rna* files there, all named with numbers, like
> vertebrate_mammalian.154.rna.fna.gz, and I am not sure which one is for
> my target species. The readme files don't really help with this. Does
> anyone know how to locate the right file for a target species there?
> Heyi
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099