[R] long run time for loop operation & matrix fill

Bert Gunter gunter.berton at gene.com
Thu Aug 7 23:52:30 CEST 2008


outer() trades off space for speed. It *does* vectorize calculations (that
is, it performs the loops in the underlying C code).
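
To make the space-for-speed trade-off concrete: as described in ?outer, both arguments are expanded to full length and FUN is called once on the expanded vectors, so FUN itself has to be vectorized. Here is a rough sketch of that idea. The name my_outer is hypothetical, and this is not the actual implementation: it ignores dimnames, array inputs and extra arguments.

```r
# Hypothetical re-implementation of the core idea behind outer();
# the real function also handles dimnames, arrays and extra arguments.
my_outer <- function(X, Y, FUN) {
  # Build two vectors of length(X) * length(Y): this is the space cost.
  xs <- rep(X, times = length(Y))
  ys <- rep(Y, each = length(X))
  # A single call to a vectorized FUN replaces the explicit double loop.
  matrix(FUN(xs, ys), nrow = length(X), ncol = length(Y))
}

add <- function(a, b) a + b
small <- my_outer(1:3, 1:4, add)

# Compare against the built-in on a small input.
same_as_outer <- identical(small, outer(1:3, 1:4, add))
same_as_outer
```

For the 2500x1500 case this means two temporary vectors of 3.75 million elements each, which is cheap next to 3.75 million interpreted loop iterations.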

The apply() family of functions (eapply, mapply, and rapply are other base R
versions that you missed; there are others in packages) are basically just
efficiently written looping functions. They may or may not offer much
speedup over explicit loops. As you said, their greatest advantage is
elegance and code readability (as functional programming, rather than
procedural programming, constructs).
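
A small sketch of that distinction, using made-up data: the explicit loop and apply() both loop at the R level, whereas colSums() pushes the loop into compiled code. All object names here are illustrative only.

```r
# Sum of squares of each column, written three ways.
m <- matrix(1:6, nrow = 2)

# Explicit loop with a preallocated result vector.
res_loop <- numeric(ncol(m))
for (j in seq_len(ncol(m))) {
  res_loop[j] <- sum(m[, j]^2)
}

# The same loop expressed with apply(); MARGIN = 2 means "over columns".
res_apply <- apply(m, 2, function(col) sum(col^2))

# A vectorized version: the loop over columns runs in compiled code.
res_vec <- colSums(m^2)

all.equal(res_loop, res_apply)
all.equal(res_loop, res_vec)
```

The first two differ mainly in readability. Only the third is likely to be noticeably faster on large inputs.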

As you also said, vectorizing calculations is a central theme in R that
takes some getting used to. I know of no general prescriptions for how to do
it; I, too, am still learning.

Finally, please heed Roland's (and r-help's) advice: provide a small,
reproducible example if you want specific help.

-- Bert Gunter

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Roland Rau
Sent: Thursday, August 07, 2008 2:22 PM
To: rcoder
Cc: r-help at r-project.org
Subject: Re: [R] long run time for loop operation & matrix fill

Hi rcoder,

rcoder wrote:
> Hi everyone,
> 
> I'm running some code containing an outer and inner loop, to fill cells in a
> 2500x1500 results matrix. I left my program running overnight, and it was
> still running when I checked 17 hours later. I have tested the operation on
> a smaller matrix and it executes fine, so I believe there is nothing wrong
> with the code. I was just wondering if this is normal program execution
> speed for such an operation on a P4 with 2GB RAM?
> 

Loops are not one of the strengths of R, I would say (at least not 
explicit ones). This is why many books and manuals on R devote 
considerable space to "the whole object view", vectorizing calculations, 
and general strategies for avoiding loops in R.

I (we) don't know what your actual program is doing. Probably applying a 
rather complicated function to each cell of your matrix?

I ran this code:

mymatrix <- matrix(rep(0.1, 2500*1500), ncol=1500)
system.time(
for (i in 1:(nrow(mymatrix))) {
   for (j in 1:(ncol(mymatrix))) {
     mymatrix[i,j] <- i+j
   }
   if ((i %% 100)==0) cat(i,"\n")
}
)
(cat output omitted)
and it took
    user  system elapsed
  139.09   55.56  199.42

seconds.
The best strategy is usually to avoid such loops.
For example, the same results could have been obtained with:

 > system.time(
+ roland <- outer(X=1:2500, Y=1:1500, FUN=function(a,b) a+b)
+ )
    user  system elapsed
    0.25    0.09    0.34

Quite a speed-up, I would say, no? Generally, using 'outer' and the apply 
family (apply, tapply, lapply, sapply -- did I forget one?) can perform 
miracles in terms of speed. They also allow you to express ideas in very 
elegant ways, in my opinion.
I have to admit, though, that it takes a while to grasp the various 
concepts (and I am also still learning).
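
As a quick sanity check, the loop and the outer() call above can be compared on a small matrix. The sketch below also shows row() + col() as another vectorized way to get the same values. The object names and the 4x3 size are arbitrary choices for illustration.

```r
nr <- 4
nc <- 3

# Explicit double loop, as in the timing example but on a small matrix.
by_loop <- matrix(0, nrow = nr, ncol = nc)
for (i in seq_len(nr)) {
  for (j in seq_len(nc)) {
    by_loop[i, j] <- i + j
  }
}

# Vectorized alternatives.
by_outer <- outer(seq_len(nr), seq_len(nc), FUN = "+")
by_index <- row(by_loop) + col(by_loop)

# all.equal() rather than identical(): the loop result is stored as
# double, while the other two are integer matrices.
all.equal(by_loop, by_outer)
all.equal(by_loop, by_index)
```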

Maybe you could supply a small, working code example as the posting 
guide suggests? This might give you more help for your specific needs.

Hope this helps,
Roland

