[R] regular expressions, sub

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Jan 27 11:49:36 CET 2006


Note that [:alpha:] is a pre-defined character class and should only be 
used inside [].  And metacharacters need to be quoted.  See ?regexp.

> f <- log(D) ~ log(N)+I(log(N)^2)+log(t)
> f1 <- deparse(f)
> f1
[1] "log(D) ~ log(N) + I(log(N)^2) + log(t)"

Now we have a string.

(f2 <- gsub("I\\((.*)\\) ", "\\1 ", f1))
[1] "log(D) ~ log(N) + log(N)^2 + log(t)"
(f3 <- gsub("(?U)log\\((.*)\\)", "ln \\1", f2, perl=TRUE))
[1] "ln D ~ ln N + ln N^2 + ln t"
(f4 <- gsub("ln ([[:alpha:]])\\^([[:digit:]])", "ln^\\2 \\1", f3))
[1] "ln D ~ ln N + ln^2 N + ln t"

That should give you some ideas to be going on with.

On Fri, 27 Jan 2006, Christian Hoffmann wrote:

> Hi,
>
> I am trying to use sub, regexpr on expressions like
>
>    log(D) ~ log(N)+I(log(N)^2)+log(t)
>
> being a model specification.
>
> The aim is to produce:
>
>    "ln D ~ ln N + ln^2 N + ln t"
>
> The variable names N, t may change, the number of terms too.
>
> I succeded only partially, help on regular expressions is hard to
> understand for me, examples on my case are rare. The help page on R-help
> for grep etc. and "regular expressions"
>
> What I am doing:
>
> (f <- log(D) ~ log(N)+I(log(N)^2)+log(t))
> (ft <- sub("","",f))   # creates string with parts of formula, how to do
> it simpler?
> (fu <- paste(ft[c(2,1,3)],collapse=" "))  # converts to one string
>
> Then I want to use \1 for backreferences something like
>
> (fv <- sub("log( [:alpha:] N  )^ [:alpha:)","ln \\1^\\2",fu))
>
> to change "log(g)^7" to "ln^7 g",
>
> and to eliminate I(): sub("I(blabla)","\\1",fv)  # I(xxx) -> xxx
>
> The special characters are making trouble, sub acceps "(", ")" only in
> pairs.

>From ?regexp

   Any metacharacter with special meaning may be quoted by preceding it
   with a backslash.  The metacharacters are '. \ | ( ) [ { ^ $ * +  ?'.


> Code for experimentation:
>
> trysub <- function(s,t,e) {
> ii<-0; for (i1 in c(TRUE,FALSE)) for (i2 in c(TRUE,FALSE)) for (i3 in
> c(TRUE,FALSE)) for (i4 in c(TRUE,FALSE))
> print(paste(ii<-ii+1,ifelse(i1,"  "," ~"),"ext",ifelse(i2,"  ","
> ~"),"perl",ifelse(i3,"  "," ~"),"fixed ",ifelse(i4,"  "," ~"),"useBytes:
> ", try(sub(s,t,e, extended=i1, perl=i2, fixed=i3,
> useBytes=i4)),sep=""));invisible(0) }
>
> trysub("I(log(N)^2)","ln n^2",fu) # A: desired result for cases
> 5,6,13..16, the rest unsubstituted
>
> trysub("log(","ln ",fu)           # B: no substitutions; errors for
> cases 1..4,7.. 12   # typical errors:
> "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement,
> x, ignore.case, useBytes) : \n\tinvalid regular expression 'log('\n"
>
> trysub("log\(","ln ",fu)          # C: same as A
>
> trysub("log\\(","ln ",fu)         # D: no substitutions; errors for
> cases 15,16        # typical errors:
> "15 ~ext ~perl ~fixed   useBytes: Error in sub(pattern, replacement, x,
> ignore.case, extended, fixed, useBytes) : \n\tinvalid regular expression
> 'log\\('\n"
>
> trysub("log\\(([:alpha:]+)\\)","ln \1",fu) # no substitutions, no errors
> # E: typical errors:
> "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement,
> x, ignore.case, useBytes) : \n\tinvalid regular expression
> 'log\\(([:alpha:]+)\\)'\n"
>
>
>
> Thanks for help
> Christian
>
> PS. The explanations in the documents
> -- 
> Dr. Christian W. Hoffmann,
> Swiss Federal Research Institute WSL
> Mathematics + Statistical Computing
> Zuercherstrasse 111
> CH-8903 Birmensdorf, Switzerland
>
> Tel +41-44-7392-277  (office)   -111(exchange)
> Fax +41-44-7392-215  (fax)
> christian.hoffmann at wsl.ch
> http://www.wsl.ch/staff/christian.hoffmann
>
> International Conference 5.-7.6.2006 Ekaterinburg Russia
> "Climate changes and their impact on boreal and temperate forests"
> http://ecoinf.uran.ru/conference/
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595




More information about the R-help mailing list