[R] Working with regular expression

Rui Barradas ruipbarradas at sapo.pt
Sat Jan 19 01:00:20 CET 2013


Hello,

Right, thanks for the explanation, it saved me time.

Rui Barradas

On 18-01-2013 22:50, David Alston wrote:
> Greetings!
>
>       I hope you don't mind, Rui Barradas, but I'd like to explain the
> regex.  Parsing it was a fun exercise!
>
>       Here's the regex broken into two parts:
>
> [[:alpha:]_]*   = match zero or more alphabetic or underscore characters
> (.*)            = match zero or more of any character and store them in
>                   the \1 pattern buffer (capture group 1)
>
>
> Going character by character through the date string "asdf May 09 2009":
>
> "asdf" matches the first part
> " May 09 2009" matches the second part and is stored in the \1 pattern
> buffer (the leading space is included, since [[:alpha:]_] does not match
> a space)
>
>
> The gsub command
>
>       gsub("[[:alpha:]_]*(.*)", "\\1", Text)
>
> replaces the entire string with the contents of the \1 pattern buffer.
> The regex matches the entire string because every element begins with a
> sequence of alphabetic and/or underscore characters, and the ".*"
> pattern at the end matches the rest of the line. gsub returns the
> result as a new character vector; the variable "Text" is the input and
> is only changed if the result is assigned back to it.
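
A minimal sketch of the call described above. The contents of Text are hypothetical here, since the original data is not shown in this message, and the names stripped and dates are introduced only for illustration.

```r
# Hypothetical sample input; the poster's real data is not shown in the thread.
Text <- c("asdf May 09 2009", "ab_cd Jun 10 2010")

# gsub() returns a new character vector; Text itself is left unchanged.
stripped <- gsub("[[:alpha:]_]*(.*)", "\\1", Text)

# The space after the prefix is not matched by [[:alpha:]_], so it ends up
# at the start of \1. Remove it with a second, anchored substitution.
dates <- sub("^[[:space:]]+", "", stripped)

print(stripped)
print(dates)
```

To overwrite the original vector, the result would have to be assigned back, as in Text <- gsub(...).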
>
>
>       If the length of the string prepended to the date is consistent,
> another possible solution would be
>
>       gsub(".{5}(.*)", "\\1", Text)
>
> which would strip off the first five characters (".{5}" matches any
> five characters).
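
A short sketch of the fixed-length variant, again on hypothetical data in which the prefix plus the following space is always five characters long. The comparison with substring() is an addition for illustration, not something proposed in the thread.

```r
# Hypothetical input: a four-letter prefix and a space before each date.
Text <- c("asdf May 09 2009", "wxyz Jun 10 2010")

# ".{5}" consumes the first five characters; "(.*)" captures the rest.
by_regex <- gsub(".{5}(.*)", "\\1", Text)

# With a fixed prefix length, base R's substring() gives the same result
# without a regex: keep everything from the sixth character onward.
by_position <- substring(Text, 6)

print(by_regex)
print(identical(by_regex, by_position))
```

Because the pattern consumes the whole string in one match, sub() would behave the same as gsub() here.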
>
>
> --David Alston
> "Without rules there is no game for it is by the rules the game is defined."
>            --SOv
>
> On Fri, Jan 18, 2013 at 3:05 PM, Christofer Bogaso
> <bogaso.christofer at gmail.com> wrote:
>> Thanks Rui for your help.
>>
>> Could you also explain the underlying logic please?
>>
>> Thanks and regards,
>>
>>
>> On Sat, Jan 19, 2013 at 2:43 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>> gsub("[[:alpha:]_]*(.*)", "\\1", Text)
>>