[R] How can I avoid nested 'for' loops or quicken the process?

Bert Gunter gunter.berton at gene.com
Fri Dec 26 16:51:58 CET 2008


Thank you for the clarification, Brian. This is very helpful (as usual).

However, I think the important point, which I misstated, is that whether it
is for() or, e.g., lapply(), the "loop" contents must still be evaluated at
the interpreted R level, and this is where most of the time is typically
spent. To get the speedup that most people hope for, the key is to avoid the
R-level loop altogether, i.e. to move the loop **and** the evaluations to C
level via R programming -- e.g. via use of matrix operations, indexing, or
built-in .Internal functions, etc.
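
To make this concrete, here is a minimal sketch rather than a benchmark. The
matrix m, its size, and the variable names are made up for the example. It
computes row means three ways: with an explicit for() loop, with apply(), and
with rowMeans(). The first two call mean() from interpreted R once per row;
only the last keeps both the loop and the arithmetic in compiled code.

```r
# Hypothetical test data: a 200 x 50 numeric matrix (sizes chosen arbitrarily).
set.seed(1)
m <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# 1. Explicit for() loop: mean() is called from interpreted R once per row.
row_means_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_means_loop[i] <- mean(m[i, ])
}

# 2. apply(): the same per-row calls to mean(), written more compactly.
row_means_apply <- apply(m, 1, mean)

# 3. rowMeans(): the loop over rows and the summation both run in compiled code.
row_means_vec <- rowMeans(m)

# All three approaches agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(row_means_loop, row_means_apply)),
          isTRUE(all.equal(row_means_loop, row_means_vec)))

# Timings depend on the machine and R version, so none are quoted here.
# Uncomment to compare on your own system:
# system.time(for (k in 1:100) apply(m, 1, mean))
# system.time(for (k in 1:100) rowMeans(m))
```

How large the difference is will depend on the data and the R version, so
it is worth timing on your own problem before rewriting anything.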

Please correct me if I'm (even partially) wrong. As you know, the issue
arises frequently.

-- Bert Gunter
Genentech

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Prof Brian Ripley
Sent: Friday, December 26, 2008 12:44 AM
To: Oliver Bandel
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] How can I avoid nested 'for' loops or quicken the process?

On Thu, 25 Dec 2008, Oliver Bandel wrote:

> Bert Gunter <gunter.berton <at> gene.com> writes:
>
>>
>> FWIW:
>>
>> Good advice below! -- after all, the first rule of optimizing code is:
>> Don't!
>>
>> For the record (yet again), the apply() family of functions (and their
>> packaged derivatives, of course) are "merely" very carefully written for()
>> loops: their main advantage is in code readability, not in efficiency gains,
>> which may well be small or nonexistent. True efficiency gains require
>> "vectorization", which essentially moves the for() loops from interpreted
>> code to (underlying) C code (on the underlying data structures): e.g.
>> compare rowMeans() [vectorized] with ave() or apply(..,1,mean).
> [...]
>
> The apply-functions do bring speed-advantages.
>
> This is not only what I read about it,
> I have used the apply-functions and really got
> results faster.
>
> The reason is simple: an apply-function does
> make in C, what otherwise would be done on the level of R
> with for-loops.

Not true of apply(): true of lapply() and hence sapply().  I'll leave you 
to check eapply, mapply, rapply, tapply.

So the issue is what is meant by 'the apply() family of functions': people 
often mean *apply(), of which apply() is an unusual member, if one at all.
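
One rough way to check this for yourself is to look at the R-level source of
each function. The sketch below uses two hypothetical helpers,
has_r_level_loop() and calls_internal(), which are not part of base R. They
deparse a function and search the text for a for() loop or an .Internal()
call. This only inspects the R wrapper in the installed version of R, so
results may differ between releases, and it says nothing definite about speed.

```r
# Hypothetical helper: TRUE if the deparsed R-level body contains a for() loop.
has_r_level_loop <- function(f) {
  any(grepl("for (", deparse(f), fixed = TRUE))
}

# Hypothetical helper: TRUE if the R-level body hands work to .Internal().
calls_internal <- function(f) {
  any(grepl(".Internal(", deparse(f), fixed = TRUE))
}

# A selection of *apply() functions to inspect; extend the list as needed.
fns <- list(apply = apply, lapply = lapply, sapply = sapply,
            mapply = mapply, tapply = tapply)

# Print one logical value per function for each check.
print(vapply(fns, has_r_level_loop, logical(1)))
print(vapply(fns, calls_internal, logical(1)))
```

Typing the function name at the prompt (e.g. apply or lapply without
parentheses) shows the same source in full.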

[Historical note: a decade ago lapply was internally a for() loop.  I 
rewrote it in C in 2000: I also moved apply to C at the same time but it 
proved too little an advantage and was reverted.  The speed of lapply 
comes mainly from reduced memory allocation: for() is also written in C.]

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
