[R] regexpr - ignore all special characters and punctuation in a string

John McKown john.archie.mckown at gmail.com
Mon Apr 20 16:11:53 CEST 2015


On Mon, Apr 20, 2015 at 8:59 AM, Dimitri Liakhovitski <
dimitri.liakhovitski at gmail.com> wrote:

> Hello!
>
> Please point me in the right direction.
> I need to match 2 strings, but focusing ONLY on letters, ignoring
> all special characters and punctuation signs, including (), "", etc.
>
> For example:
> I want the following to return: TRUE
>
> "What a nice day today! - Story of happiness: Part 2." ==
>    "What a nice day today: Story of happiness (Part 2)"
>
>
> --
> Thank you!
> Dimitri Liakhovitski
>
>
>
Perhaps a variation on:

> str1<-"What a nice day today! - Story of happiness: Part 2."
> str2<- "What a nice day today: Story of happiness (Part 2)"
> gsub('[^[:alpha:]]','',str1)==gsub('[^[:alpha:]]','',str2)
[1] TRUE
>

Each gsub() call removes every character that is not alphabetic
(punctuation, spaces and digits included) from its string, and the two
stripped strings are then compared with ==.
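
One possible variation, as a rough sketch only: because '[^[:alpha:]]' also
strips digits, "Part 2" and "Part 3" would compare as equal. Switching the
class to '[^[:alnum:]]' keeps the digits. The helper name same_text(), its
keep_digits argument and the third string str3 are made up here for
illustration; they are not from the original question.

```r
# Hypothetical helper (an assumption, not part of the original post):
# strip everything except letters, or letters and digits, then compare.
same_text <- function(a, b, keep_digits = TRUE) {
  pattern <- if (keep_digits) "[^[:alnum:]]" else "[^[:alpha:]]"
  gsub(pattern, "", a) == gsub(pattern, "", b)
}

str1 <- "What a nice day today! - Story of happiness: Part 2."
str2 <- "What a nice day today: Story of happiness (Part 2)"
str3 <- "What a nice day today: Story of happiness (Part 3)"  # made-up example

same_text(str1, str2)                       # TRUE
same_text(str1, str3)                       # FALSE: the digits differ
same_text(str1, str3, keep_digits = FALSE)  # TRUE: digits are stripped too
```

Both versions are still case-sensitive; wrapping each gsub() result in
tolower() would be one way to relax that, if it matters for your data.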


-- 
John McKown
