[R] regex help: splitting strings with no separator

Gabor Grothendieck ggrothendieck at gmail.com
Thu May 20 03:21:12 CEST 2010


One way is to use strapply in the gsubfn package.  It is like apply in
that the first argument is the object, the second is the modifier (the
margin in the case of apply, the regular expression in the case of
strapply) and the third is a function.  The parenthesized expressions
in the regular expression are captured and passed to the function as
separate arguments.  Here \\D+ matches a run of one or more non-digits
and \\d+ a run of one or more digits, so c receives the word and the
number from each string, and simplify = rbind stacks the results into
a matrix with one row per string.  See the gsubfn home page at
http://gsubfn.googlecode.com, the vignette and the help for more info.

> library(gsubfn)
> x <- c("Apple12", "HP42", "Dell91")
> strapply(x, "(\\D+)(\\d+)", c, simplify = rbind)
     [,1]    [,2]
[1,] "Apple" "12"
[2,] "HP"    "42"
[3,] "Dell"  "91"
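
If you would rather not load a package, the same split can probably be
done in base R with two calls to sub.  This is only a sketch of a
different technique, not what strapply does internally, and it assumes
every element is a run of non-digits followed by a run of digits, as in
the example data.  The names words and numbers are just illustrative.

```r
x <- c("Apple12", "HP42", "Dell91")

# Remove the trailing run of digits, leaving the word part.
words <- sub("\\d+$", "", x)

# Remove the leading run of non-digits, leaving the number part,
# then convert the remaining character strings to numeric.
numbers <- as.numeric(sub("^\\D+", "", x))

words    # "Apple" "HP" "Dell"
numbers  # 12 42 91
```

This gives two separate vectors directly, which is what the original
question asked for.  With the strapply result you would take columns 1
and 2 of the matrix instead.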


On Wed, May 19, 2010 at 8:15 PM, Krishna Tateneni <tateneni at gmail.com> wrote:
> Greetings,
>
> I have a vector of values that are a word followed by a number, e.g., x =
> c("Apple12","HP42","Dell91").  The goal is to split this vector into two
> vectors such that the first vector contains just the words and the second
> contains just the numbers.  I cannot use strsplit (or at least I do not know
> how) as there is no obvious separator.
>
> I can use sub to create a separator, e.g.,
> y = sub("([[:digit:]])", " \\1", x), and then use strsplit, but I thought
> more experienced R users may have a better solution.  I've spent some time
> with Google, but not turned up anything so far.
>
> Many thanks,
> --Krishna