[R] Optimize jackknife code

mw-u2 at gmx.de
Mon Dec 31 19:00:13 CET 2007


Hi, 

I have the following jackknife code, which is much slower than my colleague's C code. I like R very much, though, and wonder how R experts would optimize it.

I think the for (i in 1:N_B) loop is the bottleneck, because Rprof() reports that sum() is called very often, but I have no idea how to optimize it.

 
#O <- read.table("foo.dat")$V1
O <- runif(100000);

k=100 # size of block to delete
      # the jackknife block has size N-k

total_sum=sum(O);

for (k in 1:2) {

    N_B = length(O) %/% k;
    N = N_B*k; # truncate data size to multiple of k
               # data beyond O[N] is not used

    total_sum = sum(O[1:N]) # use only the first N values (N is a multiple of k)

    delete_block_sums = rep(0, N_B);
    for (i in 1:N_B)
    {
        # calculate indices of the block boundaries
        a = 1+k*(i-1);
        b = k*i;

        # sum of block to delete
        delete_block_sums[i] = sum(O[a:b]);                      
    }

    v = (total_sum - delete_block_sums) / (N-k)

    # The Jackknife error is given by
    # eps^2 = (N_B-1)/N_B sum( (O_J - \bar{O})^2 )
    # I don't understand the prefactor 

    #c(k, (N_B-1)**2/N_B * var(v));
    print(c(k, total_sum/N, sqrt( (N_B-1)/N_B * sum( (v-mean(v))**2 )) ));
}
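
One possible way to avoid the per-block sum() calls in the loop above is to reshape the data into a matrix with one block per column and let colSums() compute all block sums at once. The following is only a sketch under stated assumptions: the function name jackknife_mean is made up for illustration, and the total is taken over the first N values only (the tail beyond O[N] is dropped), which matters when length(O) is not a multiple of k.

```r
# Sketch: vectorized delete-k jackknife error of the mean.
# jackknife_mean is a hypothetical helper name, not part of the original post.
jackknife_mean <- function(O, k) {
    N_B <- length(O) %/% k      # number of blocks (must be at least 2)
    N   <- N_B * k              # usable data size, a multiple of k
    x   <- O[seq_len(N)]        # data beyond O[N] is not used

    # matrix() fills column by column, so column i holds
    # x[(1 + k*(i-1)):(k*i)], i.e. exactly block i of the loop version.
    block_sums <- colSums(matrix(x, nrow = k, ncol = N_B))

    total_sum <- sum(x)
    v <- (total_sum - block_sums) / (N - k)   # means with one block deleted

    err <- sqrt((N_B - 1) / N_B * sum((v - mean(v))^2))
    c(k = k, mean = total_sum / N, error = err)
}

O <- runif(100000)
for (k in c(1, 2, 100)) print(jackknife_mean(O, k))
```

As a sanity check, for k = 1 the error computed this way should agree with the usual standard error sd(O)/sqrt(length(O)), which is one way to see where the (N_B-1)/N_B prefactor comes from.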
