[R] regex for "[2440810] / www.tinyurl.com/hgaco4fha3"

Bert Gunter bgunter.4567 at gmail.com
Wed Feb 21 07:18:25 CET 2018


These are always kind of fun, not least because of the variety of different
replies that "work" at least somewhat. Here's mine:

> stringa <- "[2440810] / www.tinyurl.com/hgaco4fha3"

> sub("^(.+)www\\.(.+)\\.com.+","\\1\\2",stringa)
[1] "[2440810] / tinyurl"

Note the doubled backslashes: "\\." in an R string reaches the regex engine
as \., which matches a literal dot rather than any character. See ?regexp
for details.
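
To make the escaping and the two capture groups concrete, here is a small
sketch. The names pattern, result, parts and unchanged are only for
illustration and are not part of the original one-liner; the last step is my
reading of why an anchored '^www.' leaves this particular string alone.

```r
stringa <- "[2440810] / www.tinyurl.com/hgaco4fha3"

# "\\." is two characters in the R string (a backslash and a dot);
# the regex engine reads them as "match a literal dot".
pattern <- "^(.+)www\\.(.+)\\.com.+"

# \\1 = everything before "www.", \\2 = everything between "www." and ".com"
result <- sub(pattern, "\\1\\2", stringa)
print(result)

# Inspect the groups directly: element 1 is the whole match,
# elements 2 and 3 are the two parenthesised groups.
parts <- regmatches(stringa, regexec(pattern, stringa))[[1]]
print(parts[2:3])

# An unescaped dot matches any single character, an escaped one does not.
print(grepl("^www.$", "wwwx"))
print(grepl("^www\\.$", "wwwx"))

# "^" anchors at the start of the string, which here is "[", so a
# pattern beginning with "^www" finds nothing to replace.
unchanged <- sub("^www.", "", stringa)
print(identical(unchanged, stringa))
```

If only the host name is wanted, parts[3] already holds it; sub() is just a
compact way of gluing the two groups back together.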

Cheers,
Bert





On Tue, Feb 20, 2018 at 9:19 PM, Omar André Gonzáles Díaz <
oma.gonzales at gmail.com> wrote:

> Hi, I need help for cleaning this:
>
> "[2440810] / www.tinyurl.com/hgaco4fha3"
>
> My desired output is:
>
> "[2440810] / tinyurl".
>
> My attempts:
>
> stringa <- "[2440810] / www.tinyurl.com/hgaco4fha3"
>
> b <- sub('^www.', '', stringa) # wanted to get rid of "www." part, until first dot.
>
> b <- sub('[.].*', '', b) #clean from ".com" until the end.
>
> b # returns "[2440810] / www"
>
> Thank you.
