[R] loop over large dataset

Federico Calboli f.calboli at imperial.ac.uk
Mon Jul 4 16:22:37 CEST 2005

On 4 Jul 2005, at 15:15, Peter Dalgaard wrote:
> Your original code got lost in the threading, but that order of
> magnitude suggests that you have N^2/2 behaviour somewhere. The
> typical culprit is code like
> x <- numeric(0)
> for (i in 1:N){
>   newx <- <<....>>
>   x <- c(x, newx)
> }
> in which the extension of x causes the whole thing to be reallocated
> and copied. Same thing with cbind and rbind constructs of course.
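
To make the quoted point concrete, here is a minimal sketch that contrasts
growing a vector with preallocating it. The helper compute_newx() is a
hypothetical stand-in for the elided per-iteration computation, and N is an
arbitrary illustrative size; neither comes from the original code.

```r
# Hypothetical stand-in for the elided per-iteration computation.
compute_newx <- function(i) i^2

N <- 10000  # arbitrary size, chosen only for illustration

# Growing pattern: c(x, newx) builds a new, longer vector each time,
# so the total amount copied grows roughly as N^2/2.
grow <- function(N) {
  x <- numeric(0)
  for (i in 1:N) {
    x <- c(x, compute_newx(i))
  }
  x
}

# Preallocated pattern: x is allocated once at full length and
# filled by index, so no repeated reallocation is needed.
prealloc <- function(N) {
  x <- numeric(N)
  for (i in seq_len(N)) {
    x[i] <- compute_newx(i)
  }
  x
}

x_grow <- grow(N)
x_pre  <- prealloc(N)
```

Both functions return the same vector; wrapping each call in system.time()
for increasing N should show the difference in scaling.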

I changed my code a bit, and now the runtime is down to less than a
minute (from more than 24 hours). I was copying a large dataset many
times over; when I extracted the columns I need as independent vectors,
runtime dropped like a stone.
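
The original code is not shown in this thread, so the following is only a
sketch of the kind of change described. The data frame dat, its columns a, b
and out, and the row-wise sum are all hypothetical. The assumption is that
assigning into a data frame inside a loop may copy some or all of it on each
iteration (how much depends on the R version), whereas plain vectors are
updated cheaply.

```r
# Hypothetical data; the real dataset and column names are not in the thread.
set.seed(1)
n <- 5000
dat <- data.frame(a = rnorm(n), b = rnorm(n), out = numeric(n))

# Slower pattern: every iteration reads from and assigns into the
# data frame, which can trigger copying of the data frame.
slow <- function(dat) {
  for (i in seq_len(nrow(dat))) {
    dat$out[i] <- dat$a[i] + dat$b[i]
  }
  dat$out
}

# Faster pattern: pull the needed columns out once as independent
# vectors, loop over those, and return the result vector.
fast <- function(dat) {
  a <- dat$a
  b <- dat$b
  out <- numeric(length(a))
  for (i in seq_along(a)) {
    out[i] <- a[i] + b[i]
  }
  out
}

res_slow <- slow(dat)
res_fast <- fast(dat)
```

For a computation this simple, the vectorised form dat$a + dat$b would avoid
the loop altogether; the loop is kept here only to mirror the situation
described above.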



Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
