[R] How to speed up multiple for loop over list of data frames

jim holtman jholtman at gmail.com
Wed Oct 17 15:44:08 CEST 2007


The first thing to do is to run Rprof (?Rprof) on a subset of your data
to see where the time is being spent.  My guess is that most of it is in
the calls to 'cor'; if that is the case, then you will have to figure
out some other algorithm.
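
A minimal sketch of that profiling step follows. The data here are made
up (small N and M, random values in a column "v"), and the wrapper name
pairwise_cor_loop is only for illustration; the loop body mirrors the
code quoted below.

```r
# Hypothetical test data: M data frames, each with N rows and a numeric column "v".
set.seed(1)
N <- 40
M <- 50
alist <- lapply(seq_len(M), function(p) data.frame(v = rnorm(N)))

# The original triple loop, wrapped in a function so it can be profiled.
pairwise_cor_loop <- function(alist, N, M) {
  v_i <- numeric(M)
  v_j <- numeric(M)
  rho_sv <- c()
  rho_pv <- c()
  for (i in 1:(N - 1)) {
    for (j in (i + 1):N) {
      for (p in 1:M) {
        v_i[p] <- alist[[p]][i, "v"]   # data frame subsetting inside the hot loop
        v_j[p] <- alist[[p]][j, "v"]
      }
      rho_sv <- c(rho_sv, cor(v_i, v_j, method = "spearman"))
      rho_pv <- c(rho_pv, cor(v_i, v_j, method = "pearson"))
    }
  }
  list(rho_s = rho_sv, rho_p = rho_pv)
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                       # start sampling
res <- pairwise_cor_loop(alist, N, M)
Rprof(NULL)                            # stop sampling

# summaryRprof() can fail if the run was too short to record any samples,
# so guard the call; increase N or M if nothing is printed.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self, 10))
```

The by.self table lists the functions in which the most time was spent
directly, which should show whether 'cor' or the data frame indexing
dominates on your data.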

Also, if these data frames contain only numeric information, convert
them to matrices initially, because the subsetting that you are doing on
the data frame (e.g., alist[[p]][i,"v"]) can be very expensive.  The
output from Rprof will help determine what course of action you should
take.
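
As a rough sketch of the matrix conversion, assuming every data frame
in the list has the same number of rows and a numeric column "v": pull
that column out of each data frame once, so the result is an N x M
matrix. The second step, calling cor() once on the transposed matrix
instead of once per pair, goes beyond the advice above and is only one
possible "other algorithm". The function name pairwise_cor is made up.

```r
# Sketch: one N x M matrix instead of repeated data frame subsetting.
# Assumes all data frames have the same number of rows and a numeric "v".
pairwise_cor <- function(alist) {
  N <- nrow(alist[[1]])
  # Column p of V holds alist[[p]]$v, so row i of V is the old v_i vector.
  V <- vapply(alist, function(d) as.numeric(d[["v"]]), numeric(N))

  # cor() correlates columns, so transpose to correlate the rows of V.
  # Each call yields the full N x N correlation matrix.
  Rs <- cor(t(V), method = "spearman")
  Rp <- cor(t(V), method = "pearson")

  # All pairs i < j, in the same order as the nested loops visit them.
  idx <- t(combn(N, 2))
  data.frame(i = idx[, 1], j = idx[, 2],
             rho_s = Rs[idx], rho_p = Rp[idx])
}

# Usage with made-up data:
set.seed(1)
alist <- lapply(1:50, function(p) data.frame(v = rnorm(40)))
res <- pairwise_cor(alist)
head(res)
```

If you want to keep the pairwise loop, the same matrix V still helps:
V[i, ] and V[j, ] replace the inner loop over p entirely.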

On 10/16/07, Dieter Best <dieterbest_2000 at yahoo.com> wrote:
> Hi there,
>
>  I have a multiple for loop over a list of data frames
>
>  for ( i in 1:(N-1) ) {
>    for ( j in (i+1):N ) {
>        for ( p in 1:M ) {
>            v_i[p]    = alist[[p]][i,"v"]
>            v_j[p]    = alist[[p]][j,"v"]
>        }
>        rho_s = cor(v_i, v_j, method = "spearman")
>        rho_p = cor(v_i, v_j, method = "pearson" )
>        iv     = c( iv, min(i, j) )
>        jv     = c( jv, max(i, j) )
>        rho_sv = c( rho_sv, rho_s)
>        rho_pv = c( rho_pv, rho_p)
>    }
> }
>
>  N is of the order of 400, M about 800.
>
>  This takes me an entire day basically. Is there anything I could do to speed things up or is cor really that slow?
>
>  -- D


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


