[R] regexp capturing group in R

Christos Hatzis christos at nuverabio.com
Wed Feb 25 02:23:50 CET 2009


I don't know of a direct, Perl-like way to capture the groups, but here is
a workaround that extracts every run of eight digits from txt:

> mdat <- gregexpr("[[:digit:]]{8}", txt)
> dates <- mapply(function(x, y) substr(txt, x, x + y - 1), mdat[[1]],
+                 attr(mdat[[1]], "match.length"))
> dates
[1] "20080101" "20090224" 
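
If the goal is to pull out the parenthesised groups themselves rather than
every 8-digit run, two other routes may be worth trying. The sketch below is
only an illustration, not a definitive answer: the variable names (pattern,
full, start, end, groups) are made up for the example, and regexec() and
regmatches() are assumed to be available, which as far as I know requires
R 2.14.0 or later.

```r
txt <- "blah blah start=20080101 end=20090224"
pattern <- "start=([0-9]{8}).*end=([0-9]{8})"

# Route 1: sub() with backreferences.
# Pad the pattern with ".*" so it consumes the whole string, then replace
# the whole string with the contents of one capture group.
full  <- paste0(".*", pattern, ".*")
start <- sub(full, "\\1", txt)
end   <- sub(full, "\\2", txt)

# Route 2: regexec() + regmatches() (assumed available, see above).
# regexec() records the position of the full match and of each group;
# regmatches() turns those positions into substrings.
m <- regmatches(txt, regexec(pattern, txt))[[1]]
groups <- m[-1]   # drop the first element, which is the full match

print(start)
print(end)
print(groups)
```

One caveat with the sub() route: when the pattern does not match, sub()
returns the input string unchanged, so a check with grepl() beforehand may
be needed.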

-Christos

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of 
> pierre at demartines.com
> Sent: Tuesday, February 24, 2009 7:23 PM
> To: r-help at r-project.org
> Subject: [R] regexp capturing group in R
> 
> Hello,
> 
> Newbie question: how do you capture groups in a regexp in R?
> 
> Let's say I have txt="blah blah start=20080101 end=20090224".
> I'd like to get the two dates start and end.
> 
> In Perl, one would say:
> 
> my ($start,$end) = ($txt =~ /start=(\d{8}).*end=(\d{8})/);
> 
> I've tried:
> 
> txt <- "blah blah start=20080101 end=20090224"
> m <- regexpr("start=(\\d{8}).*end=(\\d{8})", txt, perl=TRUE)
> dates <- substring(txt, m, m + attr(m, "match.length") - 1)
> 
> but I get the whole matching substring...
> 
> Any idea?
> 
> ~Pierre