[BioC] problem importing a fasta file biostrings or seqinr ?
Steve Lianoglou
mailinglist.honeypot at gmail.com
Mon Nov 9 16:45:47 CET 2009
Hi Moses,
On Nov 9, 2009, at 7:21 AM, m a wrote:
> Hello,
>
> I would like to compute simple statistics on a specific DNA sequence.
> To do that, I need to import a sequence from a FASTA file.
>
> http://www.ncbi.nlm.nih.gov/nuccore/9626243?report=fasta&log$=seqview
>
> After downloading it, I ran the following code with the seqinr package:
>
> dnafile <- system.file("sequences/seqbac.fasta", package = "seqinr")
> cc<-read.fasta(file = dnafile)
>
> cc gives me then the following vector
> ...
>
> [47764] "t" "c" "c" "c" "t"
> ......
>
> My problem is that I would now like to use that vector to compute basic
> statistics, e.g. GC content and base frequencies, but I don't see
> how.
> For instance, a histogram of my vector, like hist(cc), doesn't work.
It looks like the call to seqinr::read.fasta returns a character
vector for the sequence? (I'm guessing, I haven't used it.)
If that's the case, one way to get base frequencies would be the table
function, e.g.:
R> fa <- c("t", "c", "c", "c", "t", "a", "g", "a", "a", "g")
R> table(fa)
fa
a c g t
3 3 2 2
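The same idea extends to relative frequencies and GC content. This is only a minimal sketch in base R, assuming the sequence really is a plain character vector of lowercase bases. If read.fasta hands back a list with one element per sequence (which I believe it does), you would pull out the first element with cc[[1]] before doing any of this. The names counts, freqs and gc.content are just illustrative.

```r
# Toy sequence standing in for the real one (assumption: lowercase a/c/g/t)
fa <- c("t", "c", "c", "c", "t", "a", "g", "a", "a", "g")

# Count each base; fixing the factor levels keeps a base with zero hits in the table
counts <- table(factor(fa, levels = c("a", "c", "g", "t")))

# Relative base frequencies (these sum to 1)
freqs <- counts / sum(counts)

# GC content: fraction of positions that are g or c
gc.content <- sum(fa %in% c("g", "c")) / length(fa)

print(counts)
print(freqs)
print(gc.content)
```

hist() expects numeric data, which is why hist(cc) fails on a character vector; barplot(counts) is the closer equivalent for plotting the counts.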
Though, I'd probably prefer using Biostrings:
> My first intention, by the way, was to use the Biostrings package to import
> the FASTA
> file, like readFASTA(" directory",strip.desc=TRUE). But how should I
> know
> which directory I have to put the data in? I've tried a few
> directories, but it still does not find my data.
How is it that you don't know where to find your data? I'm not sure
there's anything we can do to help you find it, so ... just find it :-)
Once you know where it is, you can pass the absolute path *of the
file* to the readFASTA function. In your example above, it looks like
you want to call "readFASTA" on a directory, which won't work.
For instance, on my computer (I'm using OS X), in order to read in
some file on my HD, I'd do:
library(Biostrings)
my.fasta <- readFASTA('/Users/stavros/Data/YeastPromoters.fa')
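If you're unsure whether R can see the file at all, a few base R calls can check the path before you hand it to any FASTA reader. A rough sketch follows; the file name example.fasta and the temporary directory are made up so that the snippet runs anywhere, and you would substitute the real location of your download.

```r
# Hypothetical file: write a tiny FASTA to a temporary directory for the demo
fasta.path <- file.path(tempdir(), "example.fasta")
writeLines(c(">example", "tccctagaag"), fasta.path)

# Directory that relative paths are resolved against
print(getwd())

# TRUE only if something really exists at that path
found <- file.exists(fasta.path)

# FALSE here: the path points at a file, not a directory
is.dir <- file.info(fasta.path)$isdir

# A path that does not exist reports FALSE instead of raising an error
missing.found <- file.exists(file.path(tempdir(), "no-such-file.fasta"))

print(c(found = found, is.dir = is.dir, missing.found = missing.found))
```

If file.exists() says FALSE for your path, the reader will fail too, so it's worth fixing the path first. In an interactive session, file.choose() lets you pick the file in a dialog and returns its full path.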
Does that help?
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact