[R] complexity of operations in R

Bert Gunter gunter.berton at gene.com
Thu Jul 19 18:32:38 CEST 2012


Thanks, Bill, but:

> system.time(z1 <-f(n=1e5,pre=TRUE))
   user  system elapsed
   0.32    0.00    0.32
> system.time(z2 <- as.list(seq_len(1e5)))
   user  system elapsed
      0       0       0
> identical(z1,z2)
[1] TRUE

So the point is not to use an R level loop at all.

-- Bert

On Thu, Jul 19, 2012 at 9:21 AM, William Dunlap <wdunlap at tibco.com> wrote:
> Preallocation of lists does speed things up.  The following shows
> time quadratic in size when there is no preallocation and linear
> growth when there is, for size in the c. 10^4 to 10^6 region:
>> f <- function(n, preallocate) { v <- if(preallocate)vector("list",n) else list() ; for(i in seq_len(n)) v[[i]] <- i ; v }
>> identical(f(17,pre=TRUE), f(17,pre=FALSE))
> [1] TRUE
>> system.time(f(n=1e4, preallocate=FALSE))
>    user  system elapsed
>   0.324   0.000   0.326
>> system.time(f(n=2e4, preallocate=FALSE)) # 2x n, 4x time
>    user  system elapsed
>   1.316   0.012   1.329
>> system.time(f(n=4e4, preallocate=FALSE)) # ditto
>    user  system elapsed
>   5.720   0.028   5.754
>>
>> system.time(f(n=1e4, preallocate=TRUE))
>    user  system elapsed
>   0.016   0.000   0.017
>> system.time(f(n=2e4, preallocate=TRUE)) # 2x n, 2x time
>    user  system elapsed
>   0.032   0.004   0.036
>> system.time(f(n=4e4, preallocate=TRUE)) # ditto
>    user  system elapsed
>   0.068   0.000   0.069
>>
>> system.time(f(n=4e5, preallocate=TRUE)) # 10x n, 10x time
>    user  system elapsed
>   0.688   0.000   0.688
>
> Above 10^6 there is some superlinearity
>> system.time(f(n=4e6, preallocate=TRUE)) # 10x n, 20x time
>    user  system elapsed
>  11.125   0.052  11.181
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
>> Behalf Of Bert Gunter
>> Sent: Thursday, July 19, 2012 9:11 AM
>> To: Hadley Wickham
>> Cc: r-help at r-project.org
>> Subject: Re: [R] complexity of operations in R
>>
>> Hadley et. al:
>>
>> Indeed. And using a loop is a poor way to do it anyway.
>>
>> v <- as.list(rep(FALSE,dotot))
>>
>> is way faster.
>>
>> -- Bert
>>
>> On Thu, Jul 19, 2012 at 8:50 AM, Hadley Wickham <hadley at rice.edu> wrote:
>> > On Thu, Jul 19, 2012 at 8:02 AM, Jan van der Laan <rhelp at eoos.dds.nl> wrote:
>> >> Johan,
>> >>
>> >> Your 'list' and 'array doubling' code can be written much more efficient.
>> >>
>> >> The following function is faster than your g and easier to read:
>> >>
>> >> g2 <- function(dotot) {
>> >>   v <- list()
>> >>   for (i in seq_len(dotot)) {
>> >>     v[[i]] <- FALSE
>> >>   }
>> >> }
>> >
>> > Except that you don't need to pre-allocate lists...
>> >
>> > Hadley
>> >
>> > --
>> > Assistant Professor / Dobelman Family Junior Chair
>> > Department of Statistics / Rice University
>> > http://had.co.nz/
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> --
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>> Internal Contact Info:
>> Phone: 467-7374
>> Website:
>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-
>> biostatistics/pdb-ncb-home.htm
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



More information about the R-help mailing list