[R] Extensively slowing for(i in 1:400) statement

Ott Toomet otoomet at econ.dk
Mon Jan 13 14:31:12 CET 2003


Hi,

It sounds like a memory-related issue.  There are several caveats
which you should be aware of.


First, as you probably know, for loops are not especially efficient in
R.  But if the calculations inside the loop are slow, the loop overhead
hardly matters, and IMHO loops help to keep the code clear.  You may
still consider e.g.

sapply(1:400, FUN=function(i) {s <- get.series(i); analyse.series(s) })
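
To make the sapply() call above concrete, here is a minimal sketch you
can run as it is.  Note that get.series() and analyse.series() below
are only made-up stand-ins (simulated data, three statistics) for your
own ODBC download and analysis functions, so treat it as an
illustration rather than a recipe.

```r
# Hypothetical stand-ins for the poster's own functions (assumptions,
# not taken from the original post).
get.series <- function(i) {
  set.seed(i)                 # reproducible fake data instead of an ODBC query
  rnorm(100, mean = i)
}
analyse.series <- function(s) {
  c(mean = mean(s), sd = sd(s), n = length(s))
}

# sapply() returns a matrix here: one column per series,
# one row per named statistic.
res <- sapply(1:400, FUN = function(i) {
  s <- get.series(i)
  analyse.series(s)
})

# Transpose so that each series becomes a row, then convert once.
stats <- as.data.frame(t(res))
dim(stats)
```

Because the conversion to a data frame happens only once, after all the
results are collected, nothing is copied repeatedly inside the loop.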



Second, how are you building your data frame?  Perhaps the right way
to do it is

df <- vector("list", 400)          # allocate the full list up front
for(i in 1:400) {
  ...
  df[[i]] <- analyse.series(...)   # fill slot i
}
df <- as.data.frame(df)            # convert once, after the loop

Note that you should set the length of the list before you fill it.
Otherwise, I guess, if you are adding components with
df <- data.frame(df, analyse.series...), R makes a copy of the whole
object on every iteration and then discards the old one.  That may be
slow, particularly if your objects are large, and it would get slower
as the object grows.
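
A small sketch of the difference, under assumptions: fake.stats() is a
made-up function returning 50 random "statistics" per series, and the
growing variant uses rbind() on a matrix rather than data.frame(), but
the copying pattern is the same.  The gap may be small at this size; it
should widen as the objects get bigger.

```r
# Hypothetical stub standing in for analyse.series(): 50 numbers per series.
fake.stats <- function(i) {
  set.seed(i)
  rnorm(50)
}

n <- 400

# Growing: every rbind() builds a new object holding all rows so far.
grow <- function() {
  out <- NULL
  for (i in 1:n) out <- rbind(out, fake.stats(i))
  out
}

# Preallocating: the list has its final length from the start.
prealloc <- function() {
  out <- vector("list", n)
  for (i in 1:n) out[[i]] <- fake.stats(i)
  do.call(rbind, out)         # combine once, at the end
}

print(system.time(a <- grow()))
print(system.time(b <- prealloc()))

# Both approaches should produce the same numbers.
all(a == b)
```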


In general, you do not need to rm() objects after use.  But if you
are creating many of them in the loop, or if they are large, it may
be an advantage.  Try to understand which objects you are creating,
how big they are, how much memory they consume, and where exactly
your program is spending its time.  Look at
system.time()  (better than date())
gc()
object.size()
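
For instance, the three functions can be used roughly like this.  The
matrix x is just an arbitrary large object made up for the
illustration; in your case you would look at the series and result
objects created inside the loop.

```r
# A deliberately large object, purely for illustration.
x <- matrix(rnorm(1e6), ncol = 100)

# User, system and elapsed time of one expression
# (finer-grained than comparing date() stamps).
timing <- system.time(colMeans(x))
print(timing)

# Approximate memory used by a single object, in bytes.
print(object.size(x))

# Remove the object, then run the garbage collector and
# print a summary of the memory currently in use.
rm(x)
print(gc())
```

Wrapping the download step and the analysis step in separate
system.time() calls should show which of the two is growing with i.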


There may also be some OS-related issues: Win95 and 98 have
significantly less capable memory management than the more recent
Windows versions and Unices, AFAIK.

Good luck!

Ott


 | From: "Kallunki, Tuomas" <Tuomas.Kallunki at sofi.fi>
 | 
 | Hello!
 | 
 | Here is what I have tried to do:
 | 
 | 1. I have 400 time series
 | 2. pull one series at a time from ODBC
 | 3. calculate some descriptives and regressions (about 50 statistics per
 | series)
 | 4. store the results in the data frame
 | 
 | The problem:
 | 
 | The time consumed in each loop seems to grow linearly. I used the date()
 | function for timing each loop, and the time spent in a loop seems to grow
 | at the speed of 0.6 * i (seconds), which implies that the last loop takes
 | 245 seconds (first loop: 5 seconds). For the first 40 loops the ODBC
 | download took 0-1 seconds, so I don't believe that the ODBC connection is
 | the problem. I am running it on a computer with AMD 800 MHz/512 000 KB,
 | which should be enough capacity. For example, a simple
 | for(i in 1:100000) statement runs very smoothly. I will probably have to
 | do several runs and I don't have the time to wait half a day per run.
 | 
 | Can anyone tell what is the problem? 
 | Is it hardware related? 
 | Can the for() statement be too long?
 | Do I have to use an rm() statement after each object becomes useless?
 | ...?
 | 
 | Thanks,
 | 
 | Tuomas Kallunki



