[R] Speeding up a bootstrap routine

babelproofreader babelproofreader at gmail.com
Mon Aug 10 23:43:35 CEST 2009


I have written the R code below to perform White's Data Mining Reality Check
(DMRC), but as it stands it is painfully slow. It is written as a function
because I call it many times from a script file with different input data,
and the output is written to a text file with sink(). Could anyone suggest
improvements to the code to increase its speed?

boot_white_test <- function(data) {

detrendedreturns <- data[,1]; # creates a separate column vector of detrended returns (preparation for multiplication that follows)
posvector1 <- data[,2]; # creates a column of position vectors for smoothed price
posvector2 <- data[,3]; # creates a column of position vectors for 2 bar prediction
posvector3 <- data[,4]; # creates a column of position vectors for 5 bar prediction

actualreturns1 <- detrendedreturns*posvector1;
actualreturns2 <- detrendedreturns*posvector2;
actualreturns3 <- detrendedreturns*posvector3;
average_daily_return1 <- mean(actualreturns1);
average_daily_return2 <- mean(actualreturns2);
average_daily_return3 <- mean(actualreturns3);

# create zero centred sampling distributions for the null hypothesis
zerocentredreturns1 <- actualreturns1-average_daily_return1;
zerocentredreturns2 <- actualreturns2-average_daily_return2;
zerocentredreturns3 <- actualreturns3-average_daily_return3;

n <- length(detrendedreturns);
result1 <- 0.0; # initialise result
result2 <- 0.0; # initialise result
result3 <- 0.0; # initialise result

# create matrices to hold sampling returns
matrix_1 <- matrix(0,1,n)
matrix_2 <- matrix(0,1,n)
matrix_3 <- matrix(0,1,n)
datevector <- 1:n # create vector for the actual "date sampling"

# the bootstrap routine, placing results into the above results matrices
for(i in 1:5000) {
date_sample <- datevector[sample(n,n,replace=TRUE)] 
 
 for(j in 1:n) {
  matrix_1[j] <- zerocentredreturns1[date_sample[j]]
  matrix_2[j] <- zerocentredreturns2[date_sample[j]]
  matrix_3[j] <- zerocentredreturns3[date_sample[j]]
  x <- mean(matrix_1)
  y <- mean(matrix_2)
  z <- mean(matrix_3)
  max_boot_return <- max(x,y,z)

  if (max_boot_return>=average_daily_return1) result1 <- result1+1 # create "p values"
  if (max_boot_return>=average_daily_return2) result2 <- result2+1 # create "p values"
  if (max_boot_return>=average_daily_return3) result3 <- result3+1 # create "p values"
 }

}

} # end of function boot_white_test
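
For comparison, here is a minimal vectorised sketch of the same resampling scheme. It is not the code from the post: the function name boot_white_test_fast and the n_boot argument are hypothetical. It assumes that each strategy's returns are zero-centred on that strategy's own mean, and that the comparison against the observed means is meant to happen once per bootstrap replicate rather than once per pass of the inner loop. Under those assumptions the inner loop over j can be replaced by a single indexed colMeans() call.

```r
# Hypothetical vectorised variant (name and n_boot are illustrative assumptions).
# data: column 1 = detrended returns, remaining columns = position vectors.
boot_white_test_fast <- function(data, n_boot = 5000) {
  data <- as.matrix(data)

  # strategy returns: detrended returns times each position column
  # (the length-n vector is recycled down every column of the matrix)
  actual <- data[, 1] * data[, -1, drop = FALSE]
  avg <- colMeans(actual)

  # zero-centre each strategy's returns for the null hypothesis
  centred <- sweep(actual, 2, avg)
  n <- nrow(centred)

  # one bootstrap replicate: resample dates with replacement, take the
  # mean return of every strategy, and keep the largest of those means
  max_boot <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)
    max(colMeans(centred[idx, , drop = FALSE]))
  })

  # share of bootstrap maxima at least as large as each observed mean
  sapply(avg, function(a) mean(max_boot >= a))
}

# Usage on made-up data (250 rows, three random +1/-1 position columns)
set.seed(1)
demo <- cbind(rnorm(250),
              matrix(sample(c(-1, 1), 750, replace = TRUE), 250, 3))
p <- boot_white_test_fast(demo, n_boot = 1000)
```

The same date sample is applied to all strategies within a replicate, as in the posted loop, so the dependence between the strategies is preserved. The function returns proportions rather than raw counts, which is an assumption about how the "p values" are meant to be reported.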