[BioC] substr on XStringSet-class

Tim Homan [guest] guest at bioconductor.org
Sat Dec 21 00:52:01 CET 2013


I would like to extract a substring for every pattern match in an XStringSet object. I currently use the code below, but it ignores multiple matches within the same sequence, and I have the feeling there is a better way to do this with Biostrings functions.

I load a FASTA file into an XStringSet object and then search for a specific string using the vmatchPattern function:

genes <- readDNAStringSet(filepath = "filename", format = "fasta", use.names = TRUE)
view <- vmatchPattern(pattern = "CCGGA", genes)
matches <- unlist(view, recursive = TRUE, use.names = TRUE)
m <- as.matrix(matches)

I then retrieve a substring that starts at the match start and spans 20 positions:

subseq(genes[rownames(m),], start = m[rownames(m),1], width = 20)
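
To make the goal concrete, here is a minimal sketch of the result I am after, run on made-up toy sequences rather than my real FASTA file. The sequence names (geneA, geneB, geneC) and the clipping of windows at the end of a sequence are my own assumptions for illustration. It relies on startIndex() returning one vector of start positions per sequence, and it repeats each sequence once per match so that every match gets its own window:

```r
library(Biostrings)

# Toy stand-in for the FASTA input; names and sequences are made up.
genes <- DNAStringSet(c(
  geneA = paste0("TTCCGGAAAACCGGA", strrep("T", 25)),  # two matches
  geneB = "ACGTACGTACGTACGTACGTACGTACGT",              # no match
  geneC = "CCGGAACGTACGT"                              # one match, short sequence
))

hits <- vmatchPattern("CCGGA", genes)

# One element per sequence; sequences without a match contribute nothing.
st <- startIndex(hits)
n <- lengths(st)                          # number of matches per sequence
starts <- unlist(st, use.names = FALSE)   # start position of every match

# Repeat each sequence index once per match so sequences and starts line up.
idx <- rep(seq_along(genes), n)

# Clip the window at the sequence end so late matches do not raise an error.
ends <- pmin(starts + 19L, width(genes)[idx])

windows <- subseq(genes[idx], start = starts, end = ends)
windows
```

Here geneA appears twice in the result because it has two matches, which is exactly what my current code loses.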

What is a better way to do this that includes all possible matches?

 -- output of sessionInfo(): 


Sent via the guest posting facility at bioconductor.org.
