[BioC] locate a target species in Refseq ftp directory
heyi xiao
xiaoheyiyh at yahoo.com
Fri Oct 4 18:12:40 CEST 2013
Thanks Jim, for the hint.
That’s even worse; I will have to download and go through all the files now.
Heyi
--------------------------------------------
On Fri, 10/4/13, James W. MacDonald <jmacdon at uw.edu> wrote:
Subject: Re: [BioC] locate a target species in Refseq ftp directory
Cc: bioconductor at r-project.org
Date: Friday, October 4, 2013, 11:53 AM
Hi Heyi,
ftp://ftp.ncbi.nih.gov/refseq/release/release-notes/RefSeq-release61.txt
And NCBI says 'Ha ha on you - it's not by species!' For example:
zcat vertebrate_mammalian.1.1.genomic.fna.gz | grep \> | head
>gi|62867015|ref|NT_112066.2|NT_112066 Callithrix jacchus genomic sequence, ENCODE region ENr231
>gi|62871432|ref|NT_108597.2|NT_108597 Papio anubis genomic sequence, ENCODE region ENm002
>gi|62903504|ref|NT_086517.2|NT_086517 Callithrix jacchus genomic sequence, ENCODE region ENm014
>gi|62903506|ref|NT_113343.1|NT_113343 Dasypus novemcinctus genomic sequence, ENCODE region ENr231
>gi|62946791|ref|NT_113349.1|NT_113349 Papio anubis genomic sequence, ENCODE region ENr323, part 2 of 2
>gi|63025534|ref|NT_091694.3|NT_091694 Otolemur garnettii genomic sequence, ENCODE region ENm010
>gi|63145882|ref|NT_106990.3|NT_106990 Otolemur garnettii genomic sequence, ENCODE region ENr322
>gi|64724026|ref|NT_107822.2|NT_107822 Bos taurus genomic sequence, ENCODE region ENm002
>gi|64724078|ref|NT_107825.2|NT_107825 Bos taurus genomic sequence, ENCODE region ENm003
>gi|64724166|ref|NT_107827.2|NT_107827 Bos taurus genomic sequence, ENCODE region ENm004
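Since the species name does show up in each header line, one option is to stream every downloaded file and keep only the records whose header mentions the species you want. Below is a minimal Python sketch of that idea, not a tested pipeline. The function names (filter_fasta_by_species, filter_gzipped_fasta) are made up for illustration, and it assumes the species name is spelled out in the header the way it is in the output above.

```python
import gzip
import io


def filter_fasta_by_species(handle, species):
    """Yield (header, sequence) for FASTA records whose header mentions `species`.

    `handle` is any iterable of text lines; matching is case-insensitive.
    """
    header, chunks, keep = None, [], False
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it matched.
            if keep:
                yield header, "".join(chunks)
            header, chunks = line, []
            keep = species.lower() in line.lower()
        elif keep:
            chunks.append(line.strip())
    # Emit the last record if it matched.
    if keep:
        yield header, "".join(chunks)


def filter_gzipped_fasta(path, species):
    """Same as above, reading a gzipped FASTA file without unpacking it on disk."""
    with gzip.open(path, "rt") as handle:
        yield from filter_fasta_by_species(handle, species)


if __name__ == "__main__":
    # Toy input: headers copied from the listing above, sequences invented.
    demo = io.StringIO(
        ">gi|64724026|ref|NT_107822.2|NT_107822 Bos taurus genomic sequence\n"
        "ACGT\nTTAA\n"
        ">gi|62871432|ref|NT_108597.2|NT_108597 Papio anubis genomic sequence\n"
        "GGCC\n"
    )
    for hdr, seq in filter_fasta_by_species(demo, "Bos taurus"):
        print(hdr, len(seq))
```

For the sheep case you would loop filter_gzipped_fasta(path, "Ovis aries") over each downloaded vertebrate_mammalian.*.rna.fna.gz file and write the hits to a single output file.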
Best,
Jim
On Friday, October 04, 2013 11:29:23 AM, heyi xiao wrote:
> Hi all,
> I am trying to extract the RNA sequences for sheep (Ovis aries) from the
> RefSeq ftp site. The right directory should be vertebrate_mammalian:
> ftp://ftp.ncbi.nlm.nih.gov/refseq/release/vertebrate_mammalian/
> But there are so many *rna* files there, all named with numbers, like
> vertebrate_mammalian.154.rna.fna.gz, and I am not sure which one is for my
> target species. The README files don’t really help with this. Does anyone
> know how to locate the right file for a target species there?
> Heyi
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099