[R] R newbie: how to replace string/regular expression

Gabor Grothendieck ggrothendieck at gmail.com
Sun Nov 2 13:55:00 CET 2008


Your gsub example is almost exactly what gsubfn in the gsubfn package
does.  gsubfn is like gsub except that the replacement can be a
function (given here in formula notation) instead of a string:

> library(gsubfn)
> gsubfn("(.*)B$", ~ as.numeric(x) * 10e6, d, ignore.case = TRUE)
[1] "120.0M"    "11.01m"    "2.097e+09" "100.00k"   "50"
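
For comparison, here is a rough base R sketch of the same
transformation without gsubfn.  The names is_b and out are made up for
illustration, and 10e6 is simply the multiplier used in the question:

```r
d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

# Flag the entries that end in "B" or "b"
is_b <- grepl("B$", d, ignore.case = TRUE)

# Copy the input, then overwrite only the flagged entries
out <- d
stripped <- sub("B$", "", d[is_b], ignore.case = TRUE)

# Convert to numeric before multiplying; the result is coerced back to
# character because 'out' is a character vector
out[is_b] <- as.character(as.numeric(stripped) * 10e6)
out
```

The as.numeric() step is the important part: arithmetic on a character
vector is an error in R, so the matched text has to be converted before
it can be multiplied.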

There are also examples very similar to this:

1. at the end of section 2 of
vignette("gsubfn")

2. in
demo("gsubfn-si")

Also see the gsubfn home page:
http://gsubfn.googlecode.com

Also note that if you want to return the values rather than
transform and reinsert them then strapply in the same package
can do that.
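
If gsubfn is not available, returning the values as numbers can also be
sketched in base R.  This is only an illustration: the scale factors
below assume the usual meanings K = 1e3, M = 1e6 and B = 1e9 (which
differ from the 10e6 and 1/100 multipliers in the question), it assumes
that an entry without a suffix is already in millions, and the names
scale_to_millions, suffix, value, mult and in_millions are invented
for the example:

```r
d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

# Assumed scale factors, expressed in millions (not taken from the thread)
scale_to_millions <- c(K = 1e-3, M = 1, B = 1e3)

# Split each string into its trailing suffix (upper-cased) and numeric part
suffix <- toupper(sub("^[0-9.]+", "", d))
value  <- as.numeric(sub("[A-Za-z]+$", "", d))

# Look up the multiplier by suffix; entries with no suffix get 1
# (assumption: such values are already in millions)
mult <- ifelse(suffix == "", 1, unname(scale_to_millions[suffix]))

in_millions <- value * mult
in_millions
```

Everything here is vectorized, so there is no per-element function call,
which should matter for a large data set.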

On Sun, Nov 2, 2008 at 3:43 AM, Krishna Dagli/Krushna Dagli
<krishna.dagli at gmail.com> wrote:
> Hello;
>
> I am an R newbie and would like to know the correct and efficient
> method for doing string replacement.
>
> I have a large data set in which I want to convert values with the
> suffixes "M", "B", and "K" (currency in millions, billions, and
> thousands) to millions.  That is, replace 209.7B with (209.7 * 10e6),
> 100.00K with (100.00 * 1/100), and so on.
>
> d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")
>
> This works, in that it removes the "b/B":
>
> gsub ("(.*)(B$)", "\\1", d, ignore.case=T, perl=T)
>
> but
>
> gsub ("(.*)(B$)", as.numeric("\\1") * 10e6, d, ignore.case=T, perl=T)
>
> does not work. I tried sprintf and other combinations with as.numeric,
> but they fail. How can I use \\1 and multiply it by 10e6?
>
> The other solution is :
>
> location <- grep ("M", d, ignore.case=T)
> y <- sub("M", "", d, ignore.case=T)
> y[location]<-y[location] * 10e6
>
> Is the second solution faster or (if) combination of grep along with
> multiply (if it works) is faster? Or what is the most efficient method
> to do something like this in R?
>
> Thanks and Regards
> Krishna


