[R] Compare String Similarity

Alekseiy Beloshitskiy abeloshitskiy at velti.com
Thu Apr 19 18:11:39 CEST 2012


Thank you, Michael,

Right, I'm looking for an R implementation of Levenshtein distance or any other similar approach. I will try it.

Thank you again!
-Alex
________________________________________
From: R. Michael Weylandt [michael.weylandt at gmail.com]
Sent: 19 April 2012 19:01
Cc: Alekseiy Beloshitskiy; r-help at r-project.org
Subject: Re: [R] Compare String Similarity

Though if you do decide to use Levenshtein, it's implemented here in R:
http://finzi.psych.upenn.edu/R/library/RecordLinkage/html/strcmp.html

I'm pretty sure it's implemented in C, so it should be mighty fast.

Michael
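
For readers following the thread, here is a minimal sketch of two ways to approach the question, assuming only base R. It does not use the RecordLinkage function linked above; instead it uses adist() from the utils package for character-level Levenshtein distance, plus a Jaccard similarity on the word sets for the order-insensitive comparison the original question asks about. The helper name jaccard_similarity is hypothetical, introduced here for illustration only.

```r
# Example vectors from the original question.
string1 <- c("depending", "audience", "research", "school")
string2 <- c("audience", "push", "drama", "button", "depending")

# Character-level Levenshtein distance between every pair of words.
# adist() lives in the utils package that ships with base R.
d <- adist(string1, string2)
dimnames(d) <- list(string1, string2)

# Order-insensitive overlap: Jaccard similarity of the two word sets,
# i.e. (number of shared words) / (number of distinct words overall).
# jaccard_similarity is a hypothetical helper, not part of any package.
jaccard_similarity <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  length(intersect(a, b)) / length(union(a, b))
}

jaccard <- jaccard_similarity(string1, string2)
```

Here d is a 4 x 5 matrix with zeros where a word appears in both vectors, and jaccard is 2/7 because two of the seven distinct words are shared. Which measure is appropriate depends on whether near-matches between individual words should count or only exact word overlap.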

On Thu, Apr 19, 2012 at 11:40 AM, Bert Gunter <gunter.berton at gene.com> wrote:
> Wrong list. This is R, not statistics (or linguistics?). Please post elsewhere.
>
> -- Bert
>
> On Thu, Apr 19, 2012 at 8:05 AM, Alekseiy Beloshitskiy
> <abeloshitskiy at velti.com> wrote:
>> Dear All,
>>
>> I need to estimate the level of similarity of two strings. For example:
>> string1 <- c("depending","audience","research", "school");
>> string2 <- c("audience","push","drama","button","depending");
>>
>> The words in each string may occur in a different order, though. What function would you recommend for estimating similarity (e.g., Levenshtein distance)?
>>
>> I would appreciate any advice.
>>
>> -Alex
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
