[BioC] trouble reading DNA stringset from keggGet function

Elliot [guest] guest at bioconductor.org
Tue Sep 10 19:55:20 CEST 2013


I am having some difficulty making FASTA files from the sequences returned by the keggGet function in the KEGGREST package. The object returned is apparently a DNAStringSet, but readDNAStringSet will not process it. I've tried it with other data and with a different kind of sequence (amino acid) and received the same error message, so I'm sure I must be missing something. My R output is below. Thanks so much for any help!



 -- R session output: 

> genes<-keggLink("ath00906")

> head(genes)
     [,1]            [,2]            [,3]     
[1,] "path:ath00906" "ath:AT1G06820" "reverse"
[2,] "path:ath00906" "ath:AT1G08550" "reverse"
[3,] "path:ath00906" "ath:AT1G10830" "reverse"
[4,] "path:ath00906" "ath:AT1G30100" "reverse"
[5,] "path:ath00906" "ath:AT1G31800" "reverse"
[6,] "path:ath00906" "ath:AT1G52340" "reverse"

> sequences<-keggGet(genes[1:10,2],"ntseq")

> head(sequences)
  A DNAStringSet instance of length 6
    width seq                                names               
[1]  1788 ATGGATTTGTGTTTTC...AGGACACTCGCATAG ath:AT1G06820 CRT...
[2]  1389 ATGGCAGTAGCTACAC...AGGAAGGTCAGGTAG ath:AT1G08550 NPQ...
[3]   858 ATGGCGGTTTATCATC...ATTGGATTTTTATGA ath:AT1G10830 Z-I...
[4]  1770 ATGGCTTGTTCTTACA...TTAAACCAGGCTTAA ath:AT1G30100 NCE...
[5]  1788 ATGGCTATGGCCTTTC...TCTGCTCTTTCTTAA ath:AT1G31800 CYP...
[6]   858 ATGTCAACGAACACTG...AAAGTCTTCAGATGA ath:AT1G52340 ABA...

> readDNAStringSet(sequences,"fasta")
Error in .normargInputFilepath(filepath) : 
  'filepath' must be a character vector with no NAs

> class(sequences) #confirm that the input is a DNA string set
[1] "DNAStringSet"
attr(,"package")
[1] "Biostrings"
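
The error message points at a likely cause: readDNAStringSet() expects a file path to read from, whereas `sequences` is already an in-memory DNAStringSet, so there is nothing left to read. A minimal sketch of writing such an object to a FASTA file with writeXStringSet() follows. The short sequences below are made-up stand-ins for the keggGet() result, and the temporary file path is an assumption for illustration only.

```r
library(Biostrings)

# Stand-in for the object returned by keggGet(..., "ntseq").
# These short sequences are invented for illustration only.
sequences <- DNAStringSet(c("ath:AT1G06820" = "ATGGATTTGTGTTTTC",
                            "ath:AT1G08550" = "ATGGCAGTAGCTACAC"))

# Write the in-memory DNAStringSet out as a FASTA file.
# fasta_path is a hypothetical location; any writable path would do.
fasta_path <- tempfile(fileext = ".fasta")
writeXStringSet(sequences, filepath = fasta_path, format = "fasta")

# readDNAStringSet() takes a file path, so it can read that file back in.
roundtrip <- readDNAStringSet(fasta_path, format = "fasta")
```

With the real data, the same writeXStringSet() call would presumably apply directly to the object returned by keggGet().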

--
Sent via the guest posting facility at bioconductor.org.


