[R] How can I avoid nested 'for' loops or quicken the process?
gunter.berton at gene.com
Fri Dec 26 16:51:58 CET 2008
Thank you for the clarification, Brian. This is very helpful (as usual).
However, I think the important point, which I misstated, is that whether one
uses for() or, e.g., lapply(), the "loop" contents must be evaluated at the
interpreted R level, and this is where most of the time is typically spent. To
get the speedup that most people hope for, the key is to avoid the loop
altogether, i.e. to move the loop **and** the evaluations to C level via R
programming: e.g. by using matrix operations, indexing, or built-in
.Internal functions.
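
As a rough illustration of the point above, the sketch below computes row
means three ways: an explicit for() loop, apply(), and the vectorized
rowMeans(). The matrix size and the names m, loop_means, apply_means and
vec_means are arbitrary choices for this example. Timings depend on the
machine and the R version, so none are quoted here.

```r
# Illustrative only: the matrix dimensions are arbitrary assumptions.
set.seed(1)
m <- matrix(rnorm(2000 * 50), nrow = 2000)

# 1. Explicit loop: mean() is called once per row from interpreted R code.
loop_means <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  loop_means[i] <- mean(m[i, ])
}

# 2. apply(): tidier to read, but FUN is still called once per row at R level.
apply_means <- apply(m, 1, mean)

# 3. rowMeans(): both the loop and the arithmetic run in compiled code.
vec_means <- rowMeans(m)

# All three agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(loop_means, apply_means)),
          isTRUE(all.equal(loop_means, vec_means)))

# Compare timings on your own machine.
t_loop  <- system.time(for (i in seq_len(nrow(m))) mean(m[i, ]))
t_apply <- system.time(apply(m, 1, mean))
t_vec   <- system.time(rowMeans(m))
print(rbind(loop = t_loop, apply = t_apply, rowMeans = t_vec)[, 1:3])
```

On a small matrix all three may finish too quickly to measure; increasing
nrow makes any difference easier to see.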
Please correct me if I'm (even partially) wrong. As you know, the issue
-- Bert Gunter
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Prof Brian Ripley
Sent: Friday, December 26, 2008 12:44 AM
To: Oliver Bandel
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] How can I avoid nested 'for' loops or quicken the process?
On Thu, 25 Dec 2008, Oliver Bandel wrote:
> Bert Gunter <gunter.berton <at> gene.com> writes:
>> Good advice below! -- after all, the first rule of optimizing code is:
>> For the record (yet again), the apply() family of functions (and their
>> packaged derivatives, of course) are "merely" very carefully written
>> loops: their main advantage is in code readability, not in efficiency gains,
>> which may well be small or nonexistent. True efficiency gains require
>> "vectorization", which essentially moves the for() loops from interpreted
>> code to (underlying) C code (on the underlying data structures): e.g.
>> compare rowMeans() [vectorized] with ave() or apply(..,1,mean).
> The apply-functions do bring speed-advantages.
> This is not only what I read about it,
> I have used the apply-functions and really got
> results faster.
> The reason is simple: an apply-function does
> in C what would otherwise be done at the R level
> with for-loops.
Not true of apply(): true of lapply() and hence sapply(). I'll leave you
to check eapply, mapply, rapply, tapply.
So the issue is what is meant by 'the apply() family of functions': people
often mean *apply(), of which apply() is an unusual member, if one at all.
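
One crude way to check the other *apply() functions is to deparse each
function body and search it for an R-level for() loop or a call to
.Internal(). The helper names body_text, has_r_loop and has_internal are
made up for this sketch, and a text search like this only hints at where the
looping happens; the result can differ between R versions.

```r
# Deparse a function body into a single string for searching.
body_text <- function(f) paste(deparse(body(f)), collapse = "\n")

# TRUE if the R source of f contains an explicit for() loop.
has_r_loop <- function(f) grepl("for (", body_text(f), fixed = TRUE)

# TRUE if the R source of f hands off to compiled code via .Internal().
has_internal <- function(f) grepl(".Internal(", body_text(f), fixed = TRUE)

fns <- list(apply = apply, lapply = lapply, sapply = sapply,
            mapply = mapply, tapply = tapply)

print(data.frame(
  r_level_for_loop = vapply(fns, has_r_loop, logical(1)),
  calls_internal   = vapply(fns, has_internal, logical(1))
))
```

A function that shows neither may simply delegate to another member of the
family, so printing its source (e.g. typing sapply at the prompt) is still the
more reliable check.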
[Historical note: a decade ago lapply was internally a for() loop. I
rewrote it in C in 2000: I also moved apply to C at the same time but it
proved too little an advantage and was reverted. The speed of lapply
comes mainly from reduced memory allocation: for() is also written in C.]
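
The memory-allocation point can be sketched by comparing a for() loop that
grows its result, a for() loop that fills a preallocated list, and lapply().
The function names grow, prealloc and via_lapply and the size n are
assumptions for this example. How large the gap is depends on the R version
and on n, so no timings are quoted.

```r
n <- 1e4  # arbitrary size for illustration

# Grows the result one element at a time, which can force repeated
# reallocation and copying.
grow <- function(n) {
  out <- list()
  for (i in seq_len(n)) out[[i]] <- i^2
  out
}

# Allocates the full result once, then fills it in place.
prealloc <- function(n) {
  out <- vector("list", n)
  for (i in seq_len(n)) out[[i]] <- i^2
  out
}

# lapply() returns a list of known length, so no growing is needed.
via_lapply <- function(n) lapply(seq_len(n), function(i) i^2)

# All three produce the same list.
stopifnot(identical(grow(n), prealloc(n)),
          identical(prealloc(n), via_lapply(n)))

# Compare timings on your own machine.
print(rbind(grow     = system.time(grow(n)),
            prealloc = system.time(prealloc(n)),
            lapply   = system.time(via_lapply(n)))[, 1:3])
```

If the preallocated loop and lapply() come out close, that is consistent with
the note above that for() itself is not the bottleneck.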
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595