[R] efficient code. how to reduce running time?

John Fox jfox at mcmaster.ca
Mon Jan 22 17:36:00 CET 2007


Dear Brian,


> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Prof 
> Brian Ripley
> Sent: Monday, January 22, 2007 11:06 AM
> To: Charilaos Skiadas
> Cc: John Fox; r-help at stat.math.ethz.ch
> Subject: Re: [R] efficient code. how to reduce running time?
> 
> On Mon, 22 Jan 2007, Charilaos Skiadas wrote:
> 
> > On Jan 21, 2007, at 8:11 PM, John Fox wrote:
> >
> >> Dear Haris,
> >>
> >> Using lapply() et al. may produce cleaner code, but it won't 
> >> necessarily speed up a computation. For example:
> >>
> >>> X <- data.frame(matrix(rnorm(1000*1000), 1000, 1000))
> >>> y <- rnorm(1000)
> >>>
> >>> mods <- as.list(1:1000)
> >>> system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
> >> [1] 40.53  0.05 40.61    NA    NA
> >>>
> >>> system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
> >> [1] 53.29  0.37 53.94    NA    NA
> >>
> > Interesting, in my system the results are quite different:
> >
> > > system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
> > [1] 192.035  12.601 797.094   0.000   0.000
> > > system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
> > [1]  59.913   9.918 289.030   0.000   0.000
> >
> > Regular MacOSX install with ~760MB memory.
> 
> But MacOS X is infamous for having rather specific speed 
> problems with its malloc, and so gives different timing 
> results from all other platforms.
> We are promised a solution in MacOS 10.5.
> 

Thanks for the clarification.

> Both of your machines seem very slow compared to mine:
> 
> > system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
>     user  system elapsed
>   11.011   0.250  11.311
> > system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
>     user  system elapsed
>   13.463   0.260  13.812
> 
> and that on a 64-bit platform (AMD64 Linux FC5).
> 

As you can see from the specs (in a previous message), my system is quite
old, which probably accounts for at least part of the difference. The ratios
of the user times (lapply() relative to the for loop) on my system and yours
aren't too different, though:

> 53.29/40.53  # mine
[1] 1.314829

> 13.463/11.011  # yours
[1] 1.222686
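
For anyone who wants to repeat the comparison on another machine, a scaled-down version might look like the sketch below. The problem size (200 columns rather than 1000), the seed, and the names mods_loop, mods_lapply, t_loop, t_lapply and ratio are illustrative assumptions, not part of the timings quoted above, and the resulting ratio will vary by platform and R version.

```r
# Sketch: time an explicit for loop against lapply() on a smaller problem.
set.seed(1)                      # assumed seed, only for reproducibility
n <- 200                         # smaller than the 1000 used in the thread
X <- data.frame(matrix(rnorm(n * n), n, n))
y <- rnorm(n)

# Explicit loop, filling a preallocated list
mods_loop <- vector("list", n)
t_loop <- system.time(
  for (i in seq_len(n)) mods_loop[[i]] <- lm(y ~ X[, i])
)

# lapply() over the columns of the data frame
t_lapply <- system.time(
  mods_lapply <- lapply(X, function(x) lm(y ~ x))
)

# Ratio of user times, lapply() relative to the loop
ratio <- unname(t_lapply["user.self"] / t_loop["user.self"])
print(ratio)
```

Both approaches fit the same 200 regressions, so any difference in the ratio reflects overhead in how the calls are dispatched rather than in the fitting itself.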

Regards,
 John

> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 


