[R] Regexpr with "."

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Wed Aug 13 23:14:06 CEST 2003


On 13-Aug-03 Barry Rowlingson wrote:
> Thompson, Trevor wrote:
>> I didn't see anything in the help file about "." being some kind
>> of special character.  Any idea why R is treating a decimal this
>> way in these functions?   Any suggestions how to get around this?
> 
> '.' is the regexpr character for matching any single character!
>  > regexpr("a.e", "Female.Alabama")
> [1] 4
>   To actually search for a dot, you need to 'escape' it with a 
> backslash, but of course the backslash needs escaping itself, with 
> another backslash. Luckily that backslash doesn't need escaping, 
> otherwise we would quickly run out of patience.
>  > regexpr("\\.", "Female.Alabama")
> [1] 7
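
The doubled backslash is only a matter of how R reads string literals:
the regex engine itself receives a single backslash followed by a dot.
A quick check, as a sketch in base R (the variable names are just for
illustration):

```r
# The R string literal "\\." holds two characters: a backslash and a dot.
pat <- "\\."
n_chars <- nchar(pat)   # 2
cat(pat, "\n")          # shows the pattern as the regex engine receives it

# Position of the first literal dot in the example string.
dot_pos <- as.integer(regexpr(pat, "Female.Alabama"))   # 7
```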

It's also worth remembering the use of [], normally used to enclose
a list of alternative characters to match (e.g. [Aa] matches either
"A" or "a") or a range (e.g. [0-9] matches any digit). Most
metacharacters occurring within the brackets are interpreted
literally, so [.] matches a dot and nothing else. The exceptions are
"^" at the start and "-" between two characters, "\", which must still
be doubled in an R string, and (for obvious reasons) "]", which is
taken literally only when it comes first in the list, as in []].
"[" itself, however, just works!

  > regexpr("a.e", "Female.Alabama")
  [1] 4
  attr(,"match.length")
  [1] 3
  > regexpr("[.]", "Female.Alabama")
  [1] 7
  attr(,"match.length")
  [1] 1
  > regexpr("[[]", "Female[Alabama")
  [1] 7
  attr(,"match.length")
  [1] 1
  > regexpr("[\\]", "Female\\Alabama")
  [1] 7
  attr(,"match.length")
  [1] 1
  > regexpr("[]]", "Female]Alabama")
  [1] 7
  attr(,"match.length")
  [1] 1
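
The same patterns can be handed to the other pattern-matching
functions. A minimal sketch, assuming the aim is to split a label such
as "Female.Alabama" at the dot (the variable names are illustrative;
fixed = TRUE is the standard argument that switches off
regular-expression interpretation altogether):

```r
x <- "Female.Alabama"

# Three ways to make the dot literal when splitting.
by_escape  <- strsplit(x, "\\.")[[1]]
by_bracket <- strsplit(x, "[.]")[[1]]
by_fixed   <- strsplit(x, ".", fixed = TRUE)[[1]]

# An unescaped "." matches every character, so only empty pieces remain.
by_any <- strsplit(x, ".")[[1]]

# The bracketed form works in sub() as well.
swapped <- sub("[.]", "_", x)
```

Which form to prefer is largely a matter of taste: "[.]" avoids
counting backslashes, while fixed = TRUE is the plainest choice when no
regular-expression features are needed at all.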

Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 13-Aug-03                                       Time: 22:14:06
------------------------------ XFMail ------------------------------



