[R] Fast string comparison

Ralf B ralf.bierig at gmail.com
Tue Jul 13 13:38:57 CEST 2010


I see. I did not get this kind of performance because I did not compare
the vectors directly; instead I ran seemingly expensive for-loops to do
the comparison element by element... :(
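
A minimal sketch of the difference between the two approaches, assuming
two character vectors of equal length. The names `a`, `b`, `changed`,
`compare_loop` and `compare_vec` are hypothetical and are not code from
this thread; the timings will depend on the machine and the R version.

```r
set.seed(1)  # fixed seed so the example is repeatable

# Hypothetical data: 10,000 random strings of 100 lowercase letters each.
n <- 1e4
a <- replicate(n, paste(sample(letters, 100, replace = TRUE), collapse = ""))
b <- a
changed <- sample(n, 100)   # 100 distinct positions, chosen at random
b[changed] <- "different"   # make those positions differ from `a`

# Iterative version: one scalar comparison per loop iteration.
compare_loop <- function(x, y) {
  out <- logical(length(x))
  for (i in seq_along(x)) {
    out[i] <- x[i] == y[i]
  }
  out
}

# Vectorized version: a single call to `==` compares all pairs at once.
compare_vec <- function(x, y) x == y

# Both approaches must give the same answer.
stopifnot(identical(compare_loop(a, b), compare_vec(a, b)))

# Compare the timings of the two versions.
system.time(compare_loop(a, b))
system.time(compare_vec(a, b))
```

If only two single strings need to be compared, `a[1] == b[1]` or
`identical(a[1], b[1])` should both work; the vectorized form matters
when many pairs are compared.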

R.

On Tue, Jul 13, 2010 at 1:42 AM, Hadley Wickham <hadley at rice.edu> wrote:
> strings <- replicate(1e5, paste(sample(letters, 100, replace = TRUE), collapse = ""))
> system.time(strings[-1] == strings[-1e5])
> #   user  system elapsed
> #  0.016   0.000   0.017
>
> So it takes ~1/100 of a second to do ~100,000 string comparisons. You
> need to provide a reproducible example that illustrates why you think
> string comparisons are slow.
>
> Hadley
>
>
> On Tue, Jul 13, 2010 at 6:52 AM, Ralf B <ralf.bierig at gmail.com> wrote:
>> I am asking this question because string comparison in R seems to be
>> awfully slow (based on profiling results) and I wonder if perhaps '=='
>> alone is not the best one can do. I did not ask for anything in
>> particular and I don't think I need to provide a self-contained source
>> example for the question. So, to rephrase my question, are there more
>> runtime-efficient ways to find out if two strings (about 100-150
>> characters long) are equal?
>>
>> Ralf
>>
>> On Sun, Jul 11, 2010 at 2:37 PM, Sharpie <chuck at sharpsteen.net> wrote:
>>>
>>>
>>> Ralf B wrote:
>>>>
>>>> What is the fastest way to compare two strings in R?
>>>>
>>>> Ralf
>>>>
>>>
>>> Which way is not fast enough?
>>>
>>> In other words, are you asking this question because profiling showed one of
>>> R's string comparison operations is causing a massive bottleneck in your
>>> code? If so, which one and how are you using it?
>>>
>>> -Charlie
>>>
>>> -----
>>> Charlie Sharpsteen
>>> Undergraduate-- Environmental Resources Engineering
>>> Humboldt State University
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
>
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>


