[R] Loops and memory

Duncan Murdoch murdoch at stats.uwo.ca
Tue May 6 22:09:44 CEST 2003


On Tue, 06 May 2003 19:35:01 +0000, you wrote in message
<Law11-F110twoPhPVKP0002a2b8 at hotmail.com>:

>Interesting.
>
>The other day I was surprised by how much longer a for loop takes to
>add two vectors a and b compared to a + b.  (I think that I made a and
>b have a million entries.)
>
>I guess my problem is that I don't really know what the issues are, I guess,
>so it's not clear to me when and where loops should be avoided.  I
>guess I should try to get a copy of this new book to find out.

I think the main issue nowadays is that your code will run much slower
if it is interpreted R code (as a for loop would be) than if it is
compiled C or Fortran code (the way "a + b" is implemented
internally).  In the old days, there was an additional penalty for
doing a loop (S tried to allocate memory each step through the loop,
but wouldn't clean up until the end), but that isn't normally a
problem in R.
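
For what it's worth, here is a rough sketch of how one might compare
the two approaches.  The vector length (a million, as in your test),
the random data, and the helper name add_loop are just assumptions
for illustration; actual timings will depend on the machine and the
R version.

```r
# Illustrative sizes and names only; not taken from the original test.
n <- 1e6
a <- runif(n)
b <- runif(n)

# Interpreted loop: every iteration is evaluated by the R interpreter.
add_loop <- function(a, b) {
  out <- numeric(length(a))   # allocate the result once, up front
  for (i in seq_along(a)) {
    out[i] <- a[i] + b[i]
  }
  out
}

# Time the loop and the vectorized form, which runs in compiled code.
loop_time <- system.time(res_loop <- add_loop(a, b))
vec_time  <- system.time(res_vec <- a + b)

print(loop_time)
print(vec_time)

# Both approaches should give the same answer.
stopifnot(isTRUE(all.equal(res_loop, res_vec)))
```

The loop version should report a noticeably larger elapsed time than
"a + b", even though both produce the same result.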

But as a general principle, I think you should always write your code
to be readable first; if it turns out to be too slow that way, then
worry about optimizing it.  As you gain experience in R you'll find
the vectorized versions of formulas more readable than the loops and
you'll need to redo things less often, but at first, loops may be the
best way to get the right answer fast.

Duncan Murdoch



