[R] Regexp subexpression

Gabor Grothendieck ggrothendieck at gmail.com
Sun Mar 26 01:57:36 CET 2006


Here are some additional variations:

> read.table(textConnection(sub(pat, '"\\1" "\\2"', patid)), as.is = TRUE)
    V1  V2
1 ALAN 334
2  AzD  44
3       NA

> read.table(textConnection(sub(pat, '"\\1" "\\2"', patid)), colClasses = "character")
    V1  V2
1 ALAN 334
2  AzD  44
3


Note that in the first case element 3,1 is the empty string and 3,2 is NA,
which occurs since the empty string is not numeric. In the second case both
elements are empty strings.
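
The difference between the two calls can be reproduced in isolation. This is only a sketch: the input lines are typed in by hand from the printed output above (the original patid vector is not shown in this message), and `text =` is used as a shorthand for wrapping the lines in textConnection().

```r
# Lines as they would come out of the sub() call; values are copied from the
# printed output above, not from the original patid vector (an assumption).
lines <- c('"ALAN" "334"', '"AzD" "44"', '"" ""')

# as.is = TRUE: columns are still type-converted, so V2 becomes integer and
# the empty string in row 3 turns into NA. V1 stays character.
d1 <- read.table(text = lines, as.is = TRUE)

# colClasses = "character": no conversion, so the empty string is kept in V2.
d2 <- read.table(text = lines, colClasses = "character")

str(d1)
str(d2)
```

Which form is preferable depends on whether the numeric part is needed as a number later on.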

On 3/25/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> Here is one more variation. This time we provide an alternative .*
> to soak up the entire expression when it would have otherwise
> failed so that the substitution occurs regardless giving us
> empty strings instead of the same string back:
>
> > pat = "^([[:alpha:]]+)([[:digit:]]+)|.*"
> > sapply(sprintf("\\%d", 1:2), sub, pattern = pat, x = patid)
>     \\1    \\2
> [1,] "ALAN" "334"
> [2,] "AzD"  "44"
> [3,] ""     ""
>
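> If NAs are needed, note that this pat always matches because of the .*
> alternative, so use the same result[regexpr(pat, patid) < 0,] <- NA
> as last time with the previous pat (the one without |.*).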
>
> On 3/25/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> > We could use sapply to reduce it slightly:
> >
> > result <- sapply(sprintf("\\%d", 1:2), sub, pattern = pat, x = patid)
> > result[regexpr(pat, patid) < 0,] <- NA
> >
> >
> > On 3/25/06, Dieter Menne <dieter.menne at menne-biomed.de> wrote:
> > > Gabor Grothendieck <ggrothendieck <at> gmail.com> writes:
> > >
> > > >
> > > > In the third case there is no match so there are no
> > > > substitutions.  Handle it separately:
> > > >
> > > > pat = "^([[:alpha:]]+)([[:digit:]]+)"
> > > > result <- cbind(txt = sub(pat, "\\1", patid), num = sub(pat, "\\2", patid))
> > > > result[regexpr(pat, patid) < 0,] <- NA
> > > >
> > >
> > > Thanks, Gabor, that is something like a compressed version of mine.  My main
> > > question was if I was missing something obvious, because I found the double sub
> > > messy. I am surprised that there is not
> > >
> > > pat = "^([[:alpha:]]+)([[:digit:]]+)"
> > > mygrep(pat, patid)
> > >
> > > returning a list with all subexpressions.
> > >
> > > Dieter
> > >
> >
>
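
For the mygrep(pat, patid) wish quoted above, one possible sketch uses regexec() and regmatches() from current base R. The function body, the extra argument n (the number of subexpressions), and the patid values are all assumptions made for illustration; none of them come from the thread, and this version returns a matrix rather than a list.

```r
# Hypothetical helper: one row per input string, one column per
# parenthesized subexpression, NA for strings that do not match.
mygrep <- function(pattern, x, n) {
  # regexec() records the full match plus each subexpression;
  # regmatches() extracts them, giving character(0) when there is no match.
  m <- regmatches(x, regexec(pattern, x))
  rows <- lapply(m, function(g) {
    if (length(g)) g[-1] else rep(NA_character_, n)  # drop the full match
  })
  matrix(unlist(rows), ncol = n, byrow = TRUE)
}

# Assumed example data; the third element is made up so that it fails to match.
patid <- c("ALAN334", "AzD44", "12")
pat <- "^([[:alpha:]]+)([[:digit:]]+)"

result <- mygrep(pat, patid, n = 2)
print(result)
```

Passing n explicitly keeps the sketch short; it avoids having to infer the number of groups when no element matches.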