[R] Finding (Ordered Subvectors)

David Winsemius dwinsemius at comcast.net
Tue Sep 21 14:50:47 CEST 2010


On Sep 21, 2010, at 6:31 AM, Lorenzo Isella wrote:

> Dear All,
> Consider a simple example
>
> a<-c(1,4,3,0,4,5,6,9,3,4)
> b<-c(0,4,5)
> c<-c(5,4,0)
>
> I would like to be able to tell whether a sequence is contained (the  
> order of the elements does matter) in another one e.g. in the  
> example above, b is a subsequence of a, whereas c is not. Since the  
> order matters, I cannot treat the sequences above as sets (also,  
> elements are repeated).
> Does anyone know a smart way of achieving that?

 > grep(paste(c, collapse="#"), paste(a, collapse="#"))
integer(0)
 > grep(paste(b, collapse="#"), paste(a, collapse="#"))
[1] 1


Looking at that output, I wonder whether you also need to put markers
around each element, to prevent a match like c(1,2,3) with
c(101,2,303):
 > grep(paste("#",b,"#", collapse="#"), paste("#",a,"#", collapse="#"))
[1] 1
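
The delimiter idea can be wrapped in a small function. This is only a
sketch: contains_run and its sep argument are names made up for
illustration, and it assumes the separator never appears in the printed
form of the elements. It uses paste0 so no spaces are inserted, and
fixed=TRUE so the pattern is treated as a literal string rather than a
regular expression.

```r
# Hypothetical helper (not part of base R): TRUE if `needle` occurs as a
# contiguous, ordered run inside `haystack`.
contains_run <- function(needle, haystack, sep = "#") {
  # Put a separator before the first and after the last element as well,
  # so that an element such as 1 cannot match inside 101.
  wrap <- function(x) paste0(sep, paste(x, collapse = sep), sep)
  # fixed = TRUE: match the pattern literally, not as a regular expression.
  grepl(wrap(needle), wrap(haystack), fixed = TRUE)
}

a <- c(1, 4, 3, 0, 4, 5, 6, 9, 3, 4)
contains_run(c(0, 4, 5), a)               # TRUE  (b from the question)
contains_run(c(5, 4, 0), a)               # FALSE (c from the question)
contains_run(c(1, 2, 3), c(101, 2, 303))  # FALSE (no partial-number match)
```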

There is also the Biostrings package in the Bioconductor repository
that provides more extensive string matching facilities.
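
If converting to strings is undesirable, a different technique is to
compare sliding windows of the numeric vector directly. The sketch below
is not the paste/grep method above; find_run is a hypothetical name, and
it assumes neither vector contains NA values. It returns the starting
positions of every match rather than a single yes/no answer.

```r
# Hypothetical helper: starting positions in `haystack` at which `needle`
# occurs as a contiguous, ordered run (integer(0) if there are none).
find_run <- function(needle, haystack) {
  n <- length(needle)
  if (n == 0 || n > length(haystack)) return(integer(0))
  # Every position at which a window of length n still fits.
  starts <- seq_len(length(haystack) - n + 1)
  # Keep the starts whose window equals `needle` element by element.
  hits <- vapply(starts,
                 function(i) all(haystack[i:(i + n - 1)] == needle),
                 logical(1))
  starts[hits]
}

a <- c(1, 4, 3, 0, 4, 5, 6, 9, 3, 4)
find_run(c(0, 4, 5), a)  # 4
find_run(c(5, 4, 0), a)  # integer(0)
```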

-- 

David Winsemius, MD
West Hartford, CT


