[R] shifted window of string

David Winsemius dwinsemius at comcast.net
Tue Jun 15 07:19:23 CEST 2010


On Jun 14, 2010, at 11:46 PM, david hilton shanabrook wrote:

> Basically, I need to create a sliding window over a string. One way
> to explain this is:
>
>> v <- c("a","b","c","d","e","f","g","h","i","j","k","l","m",
>>        "n","o","p","q","r","s","t","u","v","w","x","y")
>> window <- 5
>> shift <- 2
>
> I want a matrix of characters with "window" columns filled with "v"  
> by filling a row, then shifting over "shift" and continuing to the  
> next row until "v" is exhausted.  You can assume "v" will evenly fit  
> "m"
>
> so the result needs to look like this matrix where each row is  
> shifted 2 (in this case):
>
>> m
>      [,1] [,2] [,3] [,4] [,5]
> [1,] "a"  "b"  "c"  "d"  "e"
> [2,] "c"  "d"  "e"  "f"  "g"
> [3,] "e"  "f"  "g"  "h"  "i"
> [4,] "g"  "h"  "i"  "j"  "k"
> [5,] "i"  "j"  "k"  "l"  "m"
> [6,] "k"  "l"  "m"  "n"  "o"
> [7,] "m"  "n"  "o"  "p"  "q"
> [8,] "o"  "p"  "q"  "r"  "s"
> [9,] "q"  "r"  "s"  "t"  "u"
> [10,] "s"  "t"  "u"  "v"  "w"
> [11,] "t"  "u"  "v"  "w"  "x"

I think you got the last row wrong. With shift = 2, row 11 should start
at "u", not "t":

 > m <- matrix(v[sapply(1:window, function(x) seq(x,
                                            (length(v)-window+x) ,
                                             by = shift))],
                  ncol=window)
 > m
       [,1] [,2] [,3] [,4] [,5]
  [1,] "a"  "b"  "c"  "d"  "e"
  [2,] "c"  "d"  "e"  "f"  "g"
  [3,] "e"  "f"  "g"  "h"  "i"
  [4,] "g"  "h"  "i"  "j"  "k"
  [5,] "i"  "j"  "k"  "l"  "m"
  [6,] "k"  "l"  "m"  "n"  "o"
  [7,] "m"  "n"  "o"  "p"  "q"
  [8,] "o"  "p"  "q"  "r"  "s"
  [9,] "q"  "r"  "s"  "t"  "u"
[10,] "s"  "t"  "u"  "v"  "w"
[11,] "u"  "v"  "w"  "x"  "y"
>
> This needs to be very efficient because my data is large; loops would
> be too slow.  Any ideas?  It could also be done in a string and then
> put into the matrix, but I don't think this would be easier.

I'm not sure what you mean, but I think I might have done it the way
you thought was harder. There might be a more efficient route with
index arithmetic (integer division or modulo). Some elaboration of
 > v[((1:11)*4)%/%2 -1]
  [1] "a" "c" "e" "g" "i" "k" "m" "o" "q" "s" "u"
But I don't see it immediately.
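
One possible elaboration of that index-arithmetic idea, offered as a sketch rather than something worked out in this thread: compute the starting position of each row with seq(), then add the column offsets 0..(window - 1) with outer(). The names `starts` and `idx` are illustrative, and `letters[1:25]` stands in for the vector v from the question.

```r
v <- letters[1:25]   # same contents as v in the question: "a" .. "y"
window <- 5
shift <- 2

# Starting position of each row: 1, 3, 5, ..., 21
starts <- seq(1, length(v) - window + 1, by = shift)

# idx[i, j] = starts[i] + (j - 1), a matrix of positions into v
idx <- outer(starts, seq_len(window) - 1, "+")

# Indexing a plain vector by a matrix returns a plain vector
# (column-major order), so the dimensions are restored here
m <- matrix(v[idx], nrow = length(starts))
```

This avoids calling a function once per column, although for a window of 5 the difference from the sapply() version is probably small.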

>
> I will want to put this in a function:
>
> shiftedMatrix <- function(v, window=5, shift=2){...
>
> return(m)}

Left as an exercise for the reader.
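
For readers who want a starting point, here is a minimal sketch of such a function. It is not from the original poster or from the reply above. It assumes, as the question does, that the windows fit v evenly; if they do not, the trailing elements that cannot fill a whole window are silently dropped. The argument checks are an added assumption about what counts as valid input.

```r
shiftedMatrix <- function(v, window = 5, shift = 2) {
  # Basic sanity checks (an assumption, not part of the original request)
  stopifnot(window >= 1, shift >= 1, length(v) >= window)

  # First position of each window
  starts <- seq(1, length(v) - window + 1, by = shift)

  # Row i holds positions starts[i], starts[i] + 1, ..., starts[i] + window - 1
  idx <- outer(starts, seq_len(window) - 1, "+")

  matrix(v[idx], nrow = length(starts), ncol = window)
}

m <- shiftedMatrix(letters[1:25], window = 5, shift = 2)
```

With the 25-letter example this gives an 11 x 5 matrix whose last row is "u" "v" "w" "x" "y".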

-- David.


