[R] how to bind lists recursively

John Fox jfox at mcmaster.ca
Wed May 28 19:13:52 CEST 2008


Dear Brian,

Thanks for the explanation -- it makes sense of what I've observed in a
range of problems. I think that the bottom line is that in typical problems
involving lists, pre-allocation of the pointers won't make much
(proportional) difference to the total time.

For curiosity, I modified the example so that the list of matrices is of
length 10000 rather than 1000; repeating the problem 10 times, I got the
following average timings:

Pre-allocating the list:
user.self  sys.self   elapsed 
   26.789     0.000    26.828 

Not pre-allocating the list
user.self  sys.self   elapsed 
   27.223     0.000    27.269

Regards,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On
> Behalf Of Prof Brian Ripley
> Sent: May-28-08 9:12 AM
> To: John Fox
> Cc: r-help at r-project.org; Bill.Venables at csiro.au
> Subject: Re: [R] how to bind lists recursively
> 
> On Wed, 28 May 2008, John Fox wrote:
> 
> > Dear Brian and Bill,
> >
> > Here's an interesting contrasting example (taken from this month's Help
> Desk
> > column in R News, which Bill has already seen), first verifying the
> relative
> > timings for Brian's example:
> >
> >> system.time({
> > +   a <- vector("list", 10001)
> > +   for(i in 0:10000) a[[i+1]] <- i
> > +   })
> >   user  system elapsed
> >   0.03    0.00    0.03
> >
> >> system.time({
> > +   a <- list()
> > +   for(i in 0:10000) a[[i+1]] <- i
> > +   })
> >   user  system elapsed
> >   0.47    0.02    0.49
> >
> >> system.time({
> > +   matrices <- vector(mode="list", length=1000)
> > +   for (i in 1:1000)
> > +   matrices[[i]] <- matrix(rnorm(10000), 100, 100)
> > +   })
> >   user  system elapsed
> >   2.59    0.00    2.61
> >
> >> system.time({
> > +   matrices <- list()
> > +   for (i in 1:1000)
> > +   matrices[[i]] <- matrix(rnorm(10000), 100, 100)
> > +   })
> >   user  system elapsed
> >   2.61    0.00    2.60
> >
> > Brian will, I'm sure, have the proper explanation, but I suppose that
the
> > NULL elements in the initialized list in the first example provide
> > sufficient space for the integers that eventually get stored there,
while
> > that's not the case for the matrices in the second example. I believe
that
> > the second example is more typical of what one would actually put in a
> list.
> 
> Your explanation is incorrect as a list is just a header plus a set of
> SEXP pointers, and is never used to store data directly.
> 
> The main point is that with only 1000 elements, the difference in
> re-allocating the list on your system is only going to be about (0.49 -
> 0.04)/10 ~ 0.04, too small to time accurately (especially under Windows).
> So finding 0.02 is not really contrasting.
> 
> The point that pre-allocation may only help when list allocation is
> taking an appreciable part of the time is a sound one.
> 
> Note that because of GC tuning such things are very variable.  Running
> repeatedly either version on my Linux box gets timings from about 3.8 down
> to 2.5 seconds -- and after settling down to 2.5-2.7s, the 13th run took
> 3.2s.
> 
> >
> > Regards,
> > John
> >
> > ------------------------------
> > John Fox, Professor
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario, Canada
> > web: socserv.mcmaster.ca/jfox
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org]
> > On
> >> Behalf Of Prof Brian Ripley
> >> Sent: May-28-08 2:04 AM
> >> To: Bill.Venables at csiro.au
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] how to bind lists recursively
> >>
> >> But pre-allocation still helps
> >>
> >> a <- vector("list", 10001)
> >> for(i in 0:10000) a[[i+1]] <- i
> >>
> >> gives
> >>     user  system elapsed
> >>    0.042   0.001   0.057
> >>
> >> on my system, against
> >>     user  system elapsed
> >>    0.743   0.000   1.907
> >>
> >> for Bill's 'best' solution.
> >>
> >> On Wed, 28 May 2008, Bill.Venables at csiro.au wrote:
> >>
> >>> This is somewhat subtle.
> >>>
> >>> Rolf's solution (here corrected to...)
> >>>
> >>> a <- list()
> >>> for(i in 0:10000) a[[i+1]] <- i
> >>>
> >>> is the best of the loop solutions (or at least the best I know of).
The
> >>> apparently similar
> >>>
> >>> a <- list(0)
> >>> for(i in 1:10000) a <- c(a, list(i))
> >>>
> >>> will take a lot longer, although the result is the same.  For example:
> >>>
> >>>> system.time({
> >>>    a <- list()
> >>>    for(i in 0:10000) a[[i+1]] <- i
> >>>    })
> >>>   user  system elapsed
> >>>   0.59    0.00    0.59
> >>>> system.time({
> >>>    a <- list(0)
> >>>    for(i in 1:10000) a <- c(a, list(i))
> >>>    })
> >>>   user  system elapsed
> >>>   6.87    0.00    6.89
> >>>>
> >>>
> >>> That's a factor of about 11 times as long.  The best of the lot is
> >>>
> >>> a <- as.list(0:10000)
> >>>
> >>> of course, but this has problems with generalisation, (which everyone
> >>> suspects is going to be needed here...).
> >>>
> >>> Bill Venables.
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org]
> >>> On Behalf Of Rolf Turner
> >>> Sent: Wednesday, 28 May 2008 1:02 PM
> >>> To: Daniel Yang
> >>> Cc: r-help at r-project.org
> >>> Subject: Re: [R] how to bind lists recursively
> >>>
> >>>
> >>> On 28/05/2008, at 2:43 PM, Daniel Yang wrote:
> >>>
> >>>> Dear all,
> >>>>
> >>>> I want to create a list that contains 0,1,2,3, ..., 10000 as its
> >>>> elements. I used the following code, which apparently doesn't work
> >>>> very well.
> >>>>
> >>>> a <- 0
> >>>> for(i in 1:10000) {
> >>>>    a <- list(a, i)
> >>>> }
> >>>>
> >>>> The result is not what I wanted. So how to create the bind lists
> >>>> recursively so that the last element would be the newly added one
> >>>> while the previous elements all remain the same?
> >>>
> >>> a <- list()
> >>> for(i in 1:10000) a[[i]] <- i
> >>>
> >>> (The word ``bind'' is inappropriate.)
> >>>
> >>> 	cheers,
> >>>
> >>> 		Rolf Turner
> >>
> >> --
> >> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> >> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> >> University of Oxford,             Tel:  +44 1865 272861 (self)
> >> 1 South Parks Road,                     +44 1865 272866 (PA)
> >> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> 
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list