[Rd] HOW TO AVOID LOOPS

Mon Apr 14 17:27:25 CEST 2008

I would be interested to see how the following approach compares to the other suggestions:

> x <- c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> test <- c(0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
> out <- Reduce( function(x,y) x*y + y, x, accumulate=TRUE )
> all.equal(out,test)
[1] TRUE

For the second question, you can do something like:

> test2 <- c(0,0,1,0,0,0,3,0,0,0,2,0,1,0,0,0,0,0,0,6)
> out2 <- out * c( out[-1]==0, 1 )
> all.equal(out2,test2)
[1] TRUE

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
(801) 408-8111

> -----Original Message-----
> From: r-devel-bounces at r-project.org 
> [mailto:r-devel-bounces at r-project.org] On Behalf Of carlos martinez
> Sent: Saturday, April 12, 2008 7:33 PM
> To: r-devel at r-project.org
> Cc: Vincent.Goulet at act.ulaval.ca
> Subject: Re: [Rd] HOW TO AVOID LOOPS
> 
> Appreciate the ingenious and effective suggestions and feedback from:
> 
> Dan Davison
> Vincent Goulet
> Martin Morgan
> Hadley Wickham
> 
> The variety of technical approaches proposes so far are clear 
> prove of the strong and flexible capabilites of the R system, 
> and specially the dynamics and technical understanding of the 
> R user base.
> 
> We tested all four recommendations with an input vector of 
> more than 850000 components, and got time-responses from 
> about 40-second to 20-seconds.
> 
> All four approches produced the desired vector. The Wickham's 
> approach produced and extra vector, but the second vector 
> included the correct format.
> 
> Just one additional follow up, to obtain from the same input vector:
> c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> 
> A vector of the following format:
> (0,0,1,0,0,0,3,0,0,0,2,0,1,0,0,0,0,0,6)
> 
> Will be easier and more efficient to start from the original 
> input vector, or start from the above second vector
> (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
> 
> Thanks for your responses.
> 
> --------------------------------------------------------------
> -----------
> Hadley Wickham Approach
> 
> How about:
> 
> unlist(lapply(split(x, cumsum(x == 0)), seq_along)) - 1
> 
> Hadley
> --------------------------------------------------------------
> ------------
> -----Original Message-----
> From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
> Sent: Saturday, April 12, 2008 5:00 PM
> To: Dan Davison
> Cc: martinezbula at earthlink.net
> Subject: Re: [Rd] HOW TO AVOID LOOPS
> 
> (anonymous 'off-list' response; some extra calcs but tidy)
> 
> > x=c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> > x * unlist(lapply(rle(x)$lengths, seq))
>  [1] 0 0 1 0 1 2 3 0 0 1 2 0 1 0 1 2 3 4 5 6
> 
> 
> Dan Davison <davison at stats.ox.ac.uk> writes:
> 
> > On Sat, Apr 12, 2008 at 06:45:00PM +0100, Dan Davison wrote:
> >> On Sat, Apr 12, 2008 at 01:30:13PM -0400, Vincent Goulet wrote:
> >> > Le sam. 12 avr. à 12:47, carlos martinez a écrit :
> >> > >> Looking for a simple, effective a minimum execution 
> time solution.
> >> > >>
> >> > >> For a vector as:
> >> > >>
> >> > >> c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> >> > >>
> >> > > To transform it to the following vector without using 
> any loops:
> >> > >
> >> > >> (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
> >> > >>
> >> > > Appreciate any suggetions.
> >> > 
> >> > This does it -- but it is admittedly ugly:
> >> > 
> >> >  > x <- c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> >> >  > ind <- which(x == 0)
> >> >  > unlist(lapply(mapply(seq, ind, c(tail(ind, -1) - 1, 
> length(x))),
> >> > function(y) cumsum(x[y])))
> >> >   [1] 0 0 1 0 1 2 3 0 0 1 2 0 1 0 1 2 3 4 5 6
> >> > 
> >> > (The mapply() part is used to create the indexes of each 
> sequence 
> >> > in x starting with a 0. The rest is then straightforward.)
> >> 
> >> 
> >> Here's my effort. Maybe a bit easier to digest? Only one *apply so
> probably more efficient.
> >> 
> >> function(x=c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)) {
> >>     d <- diff(c(0,x,0))
> >>     starts <- which(d == 1)
> >>     ends <- which(d == -1)
> >>     x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n))
> >>     x
> >> }
> >> 
> >
> > Come to think of it, I suggest using the existing R function rle(), 
> > rather
> than my dodgy substitute.
> >
> > e.g.
> >
> > g <- function(x=c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)) {
> >
> >     runs <- rle(x)
> >     runlengths <- runs$lengths[runs$values == 1]
> >     x[x == 1] <- unlist(lapply(runlengths, function(n) 1:n))
> >     x
> > }
> >
> > Dan
> >
> > p.s. R-help would perhaps have been more appropriate than R-devel
> >
> >
> >> Dan
> >> 
> >> 
> >> > 
> >> > HTH
> >> > 
> >> > ---
> >> >    Vincent Goulet, Associate Professor
> >> >    École d'actuariat
> >> >    Université Laval, Québec
> >> >    Vincent.Goulet at act.ulaval.ca   http://vgoulet.act.ulaval.ca
> >> > 
> >> > ______________________________________________
> >> > R-devel at r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research 
> Center 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> 
> Location: Arnold Building M2 B169
> Phone: (206) 667-2793
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>