[R] lapply vs. for (was: Incrementing a counter in lapply)

Philippe Grosjean phgrosjean at sciviews.org
Wed Mar 15 23:02:47 CET 2006


The for() loop is very slow in S-PLUS. This is probably one of the 
motivations for developing the apply() family of functions (as well as the 
ugly For() loop) under this system.

Now, for() loops are much faster in R. Also, if you look at the R code 
in apply(), you will realize that there is a for() loop in it!
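
To make the point concrete, here is a minimal sketch of a row-wise apply 
written around an explicit for() loop. The helper name row_apply_sketch 
is made up for illustration; this is not the actual source of apply(), 
only the same basic idea.

```r
# Hypothetical helper (not the real apply() source): apply f to each
# row of a matrix using an explicit for() loop.
row_apply_sketch <- function(m, f) {
  out <- vector("list", nrow(m))      # preallocate the result list
  for (i in seq_len(nrow(m))) {
    out[[i]] <- f(m[i, ])             # call f on the i-th row
  }
  unlist(out)                         # flatten to a vector
}

m <- matrix(1:6, nrow = 2)
res_sketch <- row_apply_sketch(m, sum)
res_apply  <- apply(m, 1, sum)        # built-in equivalent
```

Both res_sketch and res_apply hold the row sums of m.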

So, why would you prefer using apply() or the like?
1) If you write code to be run in both S-PLUS and R.

2) If you want more concise code (much of the "housekeeping" is done by 
apply() and co).

3) Because the apply() family is more in line with the philosophy of 
vectorized calculation, which is the favored approach in the S language.
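
As an illustration of the "housekeeping" in point 2, here is a small 
sketch with made-up data (the list x and its names are only an example). 
The for() version has to allocate the result, index it, and copy the 
names by hand; sapply() does all of that itself.

```r
x <- list(a = 1:3, b = 4:6, c = 7:9)   # example data, invented here

# for() version: allocation, indexing and names are done by hand
means_loop <- numeric(length(x))
for (i in seq_along(x)) {
  means_loop[i] <- mean(x[[i]])
}
names(means_loop) <- names(x)

# sapply() version: the same housekeeping is handled internally
means_apply <- sapply(x, mean)
```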

Take care, however: the optimal approach is not just to replace 
for() loops with apply() and co, but to completely *rethink* your 
algorithm in a vectorized way. This often ends up with a very different 
solution!
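
A small sketch of what I mean, using an invented example (squared 
differences between consecutive elements of a vector x). The sapply() 
version is still the same element-by-element algorithm as the loop; the 
vectorized version works on whole shifted vectors and has no index at 
all.

```r
x <- c(3, 8, 2, 7, 5)                  # example data, invented here
n <- length(x)

# 1. for() loop: one element at a time
d_loop <- numeric(n - 1)
for (i in seq_len(n - 1)) {
  d_loop[i] <- (x[i + 1] - x[i])^2
}

# 2. sapply(): same algorithm, only the loop is hidden
d_apply <- sapply(seq_len(n - 1), function(i) (x[i + 1] - x[i])^2)

# 3. vectorized: subtract two shifted copies of x in one operation
d_vec <- (x[-1] - x[-n])^2
```

All three give the same result; only the third is really "rethought".
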
Best,

Philippe Grosjean

Gregor Gorjanc wrote:
>>From: Thomas Lumley
>>
>>>On Tue, 14 Mar 2006, John McHenry wrote:
>>>
>>>
>>>>Thanks, Gabor & Thomas.
>>>>
>>>>Apologies, but I used an example that obfuscated the question that I
>>>>wanted to ask.
>>>>
>>>>I really wanted to know how to have extra arguments in functions that
>>>>would allow, per the example code, for something like a counter to be
>>>>incremented. Thomas's suggestion of using mapply (reproduced below with
>>>>corrections) is probably closest.
>>>
>>>It is probably worth pointing out here that the R documentation does not
>>>specify the order in which lapply() does the computation.
>>>
>>>If you could work out how to increment a counter (and you could, with 
>>>sufficient effort), it would not necessarily work, because the 'i'th 
>>>evaluation would not necessarily be of the 'i'th element.
>>>
>>>[lapply() does in fact start at the beginning, go on until it gets to the
>>>end, and then stop, but this isn't documented.  Suppose R became
>>>multithreaded, for example....]
>>
>>The corollary, it seems to me, is that sometimes it's better to leave the
>>good old for loop alone.  It's not always profitable to turn for loops into
>>some *apply construct.  The trick is learning to know when to do it and when
>>not to.
> 
> 
> Can someone share some of these tricks with me? Up to now I have always
> done things with for loops. Just recently I started to pay attention to
> *apply* constructs and I already wanted to start implementing them
> instead of the good old for, but then a stroke of lightning came from this
> thread. Based on words from Thomas, lapply should not be used for tasks
> where order is critical. Did I get this right? Additionally, I
> have read notes (I lost the link, but it was posted on R-help, I think) from
> Thomas on R and he mentioned that it is commonly assumed that *apply* (I
> do not remember which one of *apply*) is faster than a loop, but that this
> is not true. Any additional pointers to literature?
>
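
On the counter question quoted above, one way to avoid depending on the 
order of evaluation is to pass the index explicitly as data, so that each 
call receives its own position. This is only a sketch with an invented 
vector x, along the lines of the mapply suggestion mentioned in the 
thread.

```r
x <- c("a", "b", "c")                  # example data, invented here

# Pass the index as a second argument instead of incrementing a counter
labelled <- mapply(function(value, i) paste0(i, ": ", value),
                   x, seq_along(x), USE.NAMES = FALSE)

# Same idea with lapply(): iterate over the indices themselves
labelled2 <- unlist(lapply(seq_along(x),
                           function(i) paste0(i, ": ", x[i])))
```

Each result element depends only on its own index, not on how many 
calls happened before it.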



