[R] subset of string by index

Marc Schwartz marc_schwartz at me.com
Sun Aug 8 17:16:32 CEST 2010


On Aug 8, 2010, at 9:50 AM, david h shanabrook wrote:

> How can I get a substring based on the index into the string?
> 
> strM <- c("abcde", "cdefg")
> ind <- c(1,3,5)
> 
> I want to use ind to index into the strings so the result is:
> 
> strMind <- c("ace", "ceg")



Here is one way:

> apply(sapply(ind, function(i) substr(strM, i, i)), 
        1, paste, collapse = "")
[1] "ace" "ceg"


and another:

> sapply(strsplit(strM, ""), 
         function(x) paste(x[ind], collapse = ""))
[1] "ace" "ceg"


A test using a replication of strM to create a much larger source vector suggests that the second method is meaningfully faster than the first.
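
For anyone who wants to repeat that comparison, here is a rough sketch of how such a test could be set up. The wrapper names method1 and method2 and the replication count are my own choices for illustration, not the exact test that was run, and the timings will vary by machine and R version.

```r
strM <- c("abcde", "cdefg")
ind <- c(1, 3, 5)

# Replicate the two-element vector to get a larger source vector
# (the replication count is arbitrary, chosen only for illustration)
big <- rep(strM, 10000)

# First approach: extract one character position at a time, then paste by row
method1 <- function(s, ind) {
  apply(sapply(ind, function(i) substr(s, i, i)), 1, paste, collapse = "")
}

# Second approach: split each string into characters, then index and paste
method2 <- function(s, ind) {
  sapply(strsplit(s, ""), function(x) paste(x[ind], collapse = ""))
}

t1 <- system.time(r1 <- method1(big, ind))
t2 <- system.time(r2 <- method2(big, ind))

# Both approaches should return the same result
same <- identical(r1, r2)
print(rbind(method1 = t1, method2 = t2))
```

Compare the "elapsed" column for the two rows, and increase the replication count if the difference is too small to see.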


Yet another option would be:

> gsub("^(.).(.).(.)$", "\\1\\2\\3", strM)
[1] "ace" "ceg"


and this is the fastest of the three on a large source vector, by a factor of ~10 over the second approach above. However, it is less easily generalizable, since the pattern hard-codes the positions to keep.
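
One possible way to generalize the regex approach is to build the pattern from ind rather than typing it by hand. This is only a sketch under some assumptions: ind is sorted, unique and has at most 9 elements (since only \1 through \9 are available as backreferences), and every string is at least max(ind) characters long. A string that is too short will not match and is returned unchanged. The variable names parts, pattern, replacement and res are mine, chosen for illustration.

```r
strM <- c("abcde", "cdefg")
ind <- c(1, 3, 5)

# One regex atom per character position up to max(ind):
# a capture group "(.)" where the position is wanted, a plain "." otherwise
parts <- ifelse(seq_len(max(ind)) %in% ind, "(.)", ".")

# Anchor the pattern and swallow any trailing characters after max(ind)
pattern <- paste0("^", paste(parts, collapse = ""), ".*$")

# Backreferences \1, \2, ... one per wanted position
replacement <- paste0("\\", seq_along(ind), collapse = "")

res <- gsub(pattern, replacement, strM)
res
```

With the ind above, pattern comes out as "^(.).(.).(.).*$", which matches the same characters as the hand-written version for these five-character strings.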

You might also want to look at Gabor's 'gsubfn' CRAN package to see if he has utilities there that may be relevant.

HTH,

Marc Schwartz


