[R] Splitting the string at the last sub-string

Tuszynski, Jaroslaw W. JAROSLAW.W.TUSZYNSKI at saic.com
Thu Sep 15 17:00:17 CEST 2005


Thanks for the suggestions. I suspect the "regexpr" version will be better than
my version, since I use it to find a string towards the end of a large (up
to ~30 MB) text/XML file.

Thanks again.

Jarek
====================================================\==== 
 Jarek Tuszynski, PhD.                           o / \ 
 Science Applications International Corporation  <\__,|  
 (703) 676-4192                                   ">  \ 
 Jaroslaw.W.Tuszynski at saic.com                     `   \ 

 

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Prof Brian Ripley
Sent: Thursday, September 15, 2005 10:43 AM
To: Barry Rowlingson
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Splitting the string at the last sub-string

On Thu, 15 Sep 2005, Barry Rowlingson wrote:

> Prof Brian Ripley wrote:
>
>>> substring(str, c(1, 26), c(25,length(str)))
>
>  nchar(str) surely?

Yes, or anything larger:  I actually tested 10000.
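
A small check of this point, as a sketch only: substring() truncates a `last` value that runs past the end of the string, so nchar(str) or any larger number gives the same pieces, whereas length(str) counts vector elements rather than characters. The split position 25 follows the quoted call; the sample string is made up for illustration.

```r
# Hypothetical sample string; only the split position (25) comes from the thread.
str <- "abcdefghijklmnopqrstuvwxyz0123456789"

# length(str) is the number of elements in the vector (1 here), not the
# number of characters, so the second piece comes back empty.
wrong <- substring(str, c(1, 26), c(25, length(str)))

# nchar(str) is the character count; any larger 'last' is truncated to it.
exact <- substring(str, c(1, 26), c(25, nchar(str)))
large <- substring(str, c(1, 26), c(25, 10000))

print(wrong)
print(exact)
print(identical(exact, large))
```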

>  regexps can be rather slow though. Here's two functions:

But that's not the way to do this repeatedly for the same pattern. (It is
normally compiling regexps that is slow, and regexpr is vectorized.) Not
that I would call 300us `slow'.
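
One possible reading of the vectorization remark, as a sketch rather than a definitive implementation: build the pattern once and hand regexpr() a whole character vector instead of calling a function inside a loop. The name splitAtLast and the sample inputs are made up for illustration; as in the byRipley function quoted in this thread, `sub` is treated as a regular expression.

```r
# Hypothetical helper: split each element of 'strs' just after the last
# match of the regular expression 'sub'. The greedy ".*" pushes the match as
# far right as possible, so match.length is the end position of the last 'sub'.
splitAtLast <- function(strs, sub) {
  pattern <- paste(".*", sub, sep = "")
  lp <- attr(regexpr(pattern, strs), "match.length")
  # Elements with no match report match.length -1; give them an empty head.
  lp[lp < 0] <- 0
  cbind(head = substr(strs, 1, lp),
        tail = substr(strs, lp + 1, nchar(strs)))
}

# Made-up inputs: two strings containing the separator, one without it.
strs <- c("a/b/c.txt", "x/y", "nosep")
res <- splitAtLast(strs, "/")
print(res)
```

Strings that do not contain `sub` come back with an empty head and the whole string as the tail; that is a choice made for this sketch, not something stated in the thread.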

> byRipley =
> function(str,sub){
>   lp=attr(regexpr(paste(".*",sub,sep=""),str),'match.length')
>   return(substring(str, c(1, lp+1), c(lp,nchar(str)))) }
>
> byJarek =
> function(str,sub){
>   y = unlist(strsplit(str,sub))
>   return(cbind(paste(y[-length(y)], sub, sep="", collapse = ""),
>                y[length(y)]))
> }
>
>  and a quick test:
>
> > system.time(for(i in 1:100000){byJarek(str,sub)})
> [1] 15.55  0.10 16.06  0.00  0.00
>
> > system.time(for(i in 1:100000){byRipley(str,sub)})
> [1] 30.28  0.07 31.86  0.00  0.00
>
> Baz
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595




