[BioC] readFasta and gzipped FASTA files

Ivan Gregoretti ivangreg at gmail.com
Thu Feb 14 21:32:18 CET 2013


Hello everybody,

The ShortRead package includes two very useful functions: readFastq()
and readFasta().

While readFastq() can open FASTQ files as either plain text or gzipped
files, readFasta() can only open files in plain text.

For example:

# FASTQ: success
> readFastq("t01213R0QU.fq.gz")
class: ShortReadQ
length: 43608 reads; width: 178..486 cycles

# FASTA: failure
> readFasta("t01213R0QU.P.fa.gz")
Error in .normargInputFilepath(filepath) :
  file "t01213R0QU.P.fa.gz" has unsupported type: gzfile


Is this the current behavior, or is it time for me to update my
Bioconductor installation?

Can someone offer a workaround that does not involve decompressing
the FASTA file to disk? I tried the following, without success:

> readFasta(gzfile("t01213R0QU.P.fa.gz", "r"))
Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function ‘readFasta’ for
signature ‘"gzfile"’


Thank you,

Ivan


> sessionInfo()
R Under development (unstable) (2012-11-30 r61184)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] Rsamtools_1.11.16     Biostrings_2.27.11    GenomicRanges_1.11.29
[4] IRanges_1.17.32       BiocGenerics_0.5.6

loaded via a namespace (and not attached):
[1] bitops_1.0-5   stats4_2.16.0  tools_2.16.0   zlibbioc_1.5.0


