[R] re.findall equivalent?

Gabor Grothendieck ggrothendieck at gmail.com
Fri Apr 30 18:13:31 CEST 2010


UNIX grep selects lines from a file, and R's grep similarly selects
elements of a character vector. Python's re.findall, on the other hand,
extracts substrings from within a string. These are different operations,
so there is no reason to expect the two to behave the same. Instead, try
strapply from the gsubfn package, which returns the captured substrings:

> library(gsubfn)
> text <- "=832,1*R[1]K[1]*R[2]K[1]*25%"
> pat <- "[^[[]([0-9]+[,.%]?[0-9]*)[^]]?"
> strapply(text, pat, c)[[1]]
[1] "832,1" "25%"
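
For reference, something similar can be sketched in base R without gsubfn.
This is only a sketch, and it assumes an R version in which
gregexpr(..., perl = TRUE) attaches the "capture.start" and "capture.length"
attributes to its result. The name findall1 is a hypothetical helper, not an
existing function, and it only handles a single string and a single capture
group.

```r
# Hypothetical helper (not part of base R or gsubfn): return the text of
# the first capture group for every match of `pat` in one string `x`.
findall1 <- function(pat, x) {
  m <- gregexpr(pat, x, perl = TRUE)[[1]]
  if (m[1] == -1) return(character(0))      # no match at all
  start <- attr(m, "capture.start")[, 1]    # start of group 1, one per match
  len   <- attr(m, "capture.length")[, 1]   # length of group 1, one per match
  substring(x, start, start + len - 1)
}

text <- "=832,1*R[1]K[1]*R[2]K[1]*25%"
pat  <- "[^[[]([0-9]+[,.%]?[0-9]*)[^]]?"

# grep only reports which elements match, so the whole string comes back
grep(pat, text, value = TRUE, perl = TRUE)

# the helper extracts the captured substrings, as strapply does above
findall1(pat, text)
```

The last call gives "832,1" and "25%", the same two substrings as the
strapply call, while grep returns the input string unchanged because it
only selects matching elements.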

On Fri, Apr 30, 2010 at 11:59 AM, Albert-Jan Roskam <fomcl at yahoo.com> wrote:
> Hi,
>
> The regular expression call (grep) below does not behave like the equivalent in Python. I would also be glad if somebody could tell me what the R equivalent of Python's re.findall is. The regex extracts any numbers not enclosed in square brackets, including fractions (with either a comma or a dot as the separator) and percentages. How should the R code below be modified so that it does the same as the Python code?
>
> # python code
> >>> import re
> >>> pattern = "[^[[]([0-9]+[,.%]?[0-9]*)[^]]?"
> >>> formula = "=832.1*R[1]K[1]*R[2]K[1]*25%"
> >>> re.findall(pattern, formula)
> ['832.1', '25%']
>
> # partial R code
> > formula <- "=832,1*R[1]K[1]*R[2]K[1]*25%"
> > pattern <- "[^[[]([0-9]+[,.%]?[0-9]*)[^]]?"
> > grep(pattern, formula, value=TRUE, perl=TRUE)
> [1] "=832,1*R[1]K[1]*R[2]K[1]*25%"
>
> Thank you, and have a good weekend!
>
> Cheers!!
>
> Albert-Jan
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>