[R] regular expression question

Dirk Eddelbuettel edd at debian.org
Sun Jun 11 23:58:36 CEST 2006


On 11 June 2006 at 14:35, Jeff Newmiller wrote:
| >>gsub("(\\d*)$","",c("AAL123", "XELB245", "A247", "FOO123BAR"), perl=TRUE)
| > 
| > [1] "AAL"       "XELB"      "A"         "FOO123BAR"
| > 
| > 
| > gsub finds what is described by the first regexp [ here (\\d*)$ --- any
| > sequence of digits before the end-of-line ] and substitutes the second
| > argument [ here an empty string as we simply delete ] for each match in
| > the third argument.
| > 
| > Note 
| >  - how the $ symbol prevents it from eating the non-final digits
| >    in the counter example FOO123BAR
| >  - how the \d for digits needs escaped backslashes \\d
| >  - how the * char denotes '1 or more of the preceding thingie'
| 
| * normally means "zero or more of the preceding thingie"
| + is the "1 or more of the preceding thingie"
| 
| The difference would be apparent if the string being inserted was not
| empty.
| 
|  > gsub("(\\d*)$","new",c("AAL123", "XELB245", "A247", "FOO123BAR"), perl=TRUE)
| [1] "AALnew"       "XELBnew"      "Anew"         "FOO123BARnew"
| 
|  > gsub("(\\d+)$","new",c("AAL123", "XELB245", "A247", "FOO123BAR"), perl=TRUE)
| [1] "AALnew"    "XELBnew"   "Anew"      "FOO123BAR"

Thanks for catching, and correcting, that.
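
For the archive, the distinction can also be checked without relying on the
replacement text. The following is only a minimal sketch on the same four
test strings; the variable names (has_plus, has_star, plus_new, star_new) are
made up for illustration, and it uses sub() rather than gsub() so that only
the first match is replaced.

```r
x <- c("AAL123", "XELB245", "A247", "FOO123BAR")

# "+" needs at least one digit directly before the end of the string,
# so "FOO123BAR" does not match at all.
has_plus <- grepl("\\d+$", x, perl = TRUE)

# "*" also accepts zero digits, so the empty string at the very end
# of "FOO123BAR" counts as a match.
has_star <- grepl("\\d*$", x, perl = TRUE)

# sub() replaces only the first (leftmost) match in each string.
plus_new <- sub("\\d+$", "new", x, perl = TRUE)
star_new <- sub("\\d*$", "new", x, perl = TRUE)

print(has_plus)
print(has_star)
print(plus_new)
print(star_new)
```

With "+" the last string is left alone because nothing matches; with "*" the
empty match at the end of the string is replaced, which is why "new" gets
appended to FOO123BAR in the example quoted above.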

Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
                                                  -- Thomas A. Edison


