[R] Regular Expression help

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Sat Nov 8 23:33:34 CET 2008


Gabor Grothendieck wrote:
> I suspect strapply is only relatively slow on short strings where
> it doesn't matter anyways since for long strings performance would
> likely be dominated by the underlying regexp operations.  I know that
> users are using the package for very long strings since I once had
> to lift the 25,000 character limit since I had complaints about that.
> The expressiveness and brevity of strapply (it would be shortest if it
> were not for the length of the word simplify) offset any disadvantage
> in my view.
>   
ok, the attached script tests against strings of length 30000, where the
matching character is precisely the last one.  (gabor3 is a dummy,
because i had no patience to wait over a minute...)  note that the
strapply version is still approximately an order of magnitude slower.

with the original script and string length (m) set to 10000, the
strapply version is two orders of magnitude slower.

it might be that the test is poor, though -- design a smart test where
strapply wins ;)
(related to the original problem, of course.)
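
a minimal sketch of the kind of test described above, for readers without
the attachment.  it is not the attached script: the string contents, the
pattern "b", and the helper name make_test_string are assumptions chosen
for illustration, and the strapply timing only runs if the gsubfn package
happens to be installed.

```r
# Sketch only: string contents, pattern, and helper name are assumptions,
# not taken from the attached script.
make_test_string <- function(m) {
  # m - 1 non-matching characters followed by a single match at the very end
  paste0(strrep("a", m - 1), "b")
}

s <- make_test_string(30000)

# base R: locate the first match and time the call
base_time <- system.time(pos <- regexpr("b", s))

# gsubfn::strapply on the same input, timed only if gsubfn is available
if (requireNamespace("gsubfn", quietly = TRUE)) {
  strapply_time <- system.time(hit <- gsubfn::strapply(s, "b"))
  print(c(base = base_time[["elapsed"]],
          strapply = strapply_time[["elapsed"]]))
}
```

a single call on one string is usually too quick to time reliably, so
wrapping each call in replicate() or a loop gives more stable numbers.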

vQ
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pq+.r
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20081108/9e2bef82/attachment.pl>

