[R] Performance note: Preallocating helps? and two questions

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Wed Nov 1 15:47:47 CET 2000

"Bob Sandefur" <rls at pincock.com> writes:

> hi-
>  in r 1.1 on windows 2000
>  with length(AU) of 35833
>  AUcap30<-0
>  for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> took over an hour on pentium II 300 mhertz (I esc'ed before it finished)
> but
> AUcap30<-AU 
>  for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> is very quick (a few seconds)
> Is this performance difference common in r (ie is linux the same way)?

Yes. The problem with the first version is that vectors are allocated
just long enough. Assigning past the end of one will stretch it, but
that involves allocating a new longer vector, copying the data into
the first N elements and then assigning the new value to the N+1st
element. So the first version makes 35833 allocations, of sizes
1, 2, 3, ..., 35833, and the total amount of copying grows with the
square of the vector length. That is going to take some time.
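
For anyone who wants to time the two patterns in a current R, here is a
minimal sketch. The data are simulated, since the original AU is not
available, and the function names grow_vector and fill_preallocated are
made up for illustration. Timings depend heavily on the R version and the
machine, so none are quoted here.

```r
# Simulated stand-in for the poster's data (assumption: positive numbers).
set.seed(1)
AU <- rexp(35833, rate = 1 / 20)

# Version 1: start from a length-1 vector and assign past its end each time.
grow_vector <- function(x) {
  out <- 0
  for (i in seq_along(x)) out[i] <- min(30, x[i])
  out
}

# Version 2: allocate the full-length result once, then fill it in.
fill_preallocated <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- min(30, x[i])
  out
}

t_grow <- system.time(r1 <- grow_vector(AU))
t_fill <- system.time(r2 <- fill_preallocated(AU))

# Elapsed seconds for each version.
print(c(grow = t_grow[["elapsed"]], preallocated = t_fill[["elapsed"]]))

# Both versions compute the same values; only the allocation pattern differs.
stopifnot(identical(r1, r2))
```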

The canonical form for a loop calculating a vector element by element
would be

for (i in 1:N) # often better: "i in seq(length=N)"
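
The loop header above is only the skeleton; presumably it goes with a
result vector allocated at full length beforehand. A small sketch of that
pattern follows, assuming a hypothetical helper cap_at_30 and the seq_len(N)
of current R, which gives the same sequence as seq(length=N). It also shows
why the comment prefers seq(length=N): when N is 0, 1:N counts down from 1
to 0 instead of being empty.

```r
# cap_at_30 is a hypothetical helper used only for illustration.
cap_at_30 <- function(x) {
  N <- length(x)
  out <- numeric(N)            # allocate the whole result once
  for (i in seq_len(N)) {      # seq_len(N) is empty when N == 0
    out[i] <- min(30, x[i])
  }
  out
}

cap_at_30(c(12, 45, 30.5))     # 12 30 30
cap_at_30(numeric(0))          # numeric(0)

# The pitfall with 1:N for an empty input:
N <- 0
1:N                            # 1 0, so a loop body would run twice
length(seq_len(N))             # 0, so a loop body would not run
```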

> Are there other tricks to speed up R (in windows) (besides a faster processor and more memory)?


One of the more effective ones is vectorisation: Try

AUcap30 <- pmin(30,AU)
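
To see what the vectorised call does, here is a small sketch on made-up
values (the vector AU_demo is hypothetical). pmin() takes the minimum
element by element, recycling the 30, whereas min() would collapse
everything to a single number.

```r
AU_demo <- c(5.2, 31.0, 30.0, 120.7, 0.4)   # made-up values

capped <- pmin(30, AU_demo)
capped                                       # 5.2 30.0 30.0 30.0 0.4

# min() reduces all its arguments to one value, which is not wanted here.
min(30, AU_demo)                             # 0.4

# The vectorised result matches the element-by-element loop.
looped <- numeric(length(AU_demo))
for (i in seq_along(AU_demo)) looped[i] <- min(30, AU_demo[i])
stopifnot(identical(capped, looped))
```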

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
