[R] Improving data processing efficiency

hadley wickham h.wickham at gmail.com
Sat Jun 7 00:55:04 CEST 2008

On Fri, Jun 6, 2008 at 5:10 PM, Daniel Folkinshteyn <dfolkins at gmail.com> wrote:
> Hmm... ok... so I ran the code twice for the first 20 quarters - once with a
> preallocated result, assigning rows to it, and once with an nrow=0 result,
> rbinding rows to it. There was no speedup. In fact, running with a
> preallocated result matrix was slower than rbinding to the matrix:
> for the preallocated matrix:
> Time difference of 1.577779 mins
> for rbinding:
> Time difference of 1.498628 mins
> (The time difference only counts from the start of the loop to the end, so
> the time to allocate the empty matrix was /not/ included in the count.)
> So it appears that rbinding a matrix is not the bottleneck. (That it was
> actually faster than assigning rows could have been a random anomaly - e.g.
> some other process eating a bit of CPU during the run - or not; at any
> rate, it doesn't make an /appreciable/ difference.)
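For reference, the comparison described above can be sketched in a minimal, self-contained form. Here `make_row` is a hypothetical stand-in for the real per-quarter computation, and `n` is small; rbind's copying cost grows with the size of the accumulated result, so the gap between the two approaches widens as the number of iterations increases:

```r
n <- 2000
make_row <- function(i) c(i, i^2, sqrt(i))  # stand-in for the real per-row work

# Growing with rbind: each call copies the whole matrix accumulated so far.
t_rbind <- system.time({
  res1 <- matrix(nrow = 0, ncol = 3)
  for (i in seq_len(n)) res1 <- rbind(res1, make_row(i))
})

# Preallocating once and assigning rows in place.
t_prealloc <- system.time({
  res2 <- matrix(NA_real_, nrow = n, ncol = 3)
  for (i in seq_len(n)) res2[i, ] <- make_row(i)
})

t_rbind["elapsed"]
t_prealloc["elapsed"]
```

Both loops produce identical matrices; only the allocation strategy differs.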

Why not try profiling?  The profr package provides an alternative
display that I find more helpful than the default tools:

library(profr)
p <- profr(fcn_create_nonissuing_match_by_quarterssinceissue(...))
plot(p)

That should at least help you see where the slow bits are.
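For comparison, the default tools referred to above are Rprof() and summaryRprof(). A minimal sketch, where `slow_fun` is a hypothetical stand-in for the function under investigation:

```r
# slow_fun stands in for the real function being profiled; it is
# deliberately slow because it grows a vector one element at a time.
slow_fun <- function() {
  x <- numeric(0)
  for (i in 1:30000) x <- c(x, i^2)
  sum(x)
}

tmp <- tempfile()
Rprof(tmp, interval = 0.01)      # start the sampling profiler
invisible(slow_fun())
Rprof(NULL)                      # stop profiling
head(summaryRprof(tmp)$by.self)  # functions where time was spent
```

summaryRprof() reports time per function in a flat table; profr's plot instead lays the call stack out over time, which is what makes the slow bits easier to spot.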


