[R] Improvement in Process time

Amelia Marsh amelia_marsh08 at yahoo.com
Tue Feb 2 13:03:02 CET 2016


Dear R forum,

I am running a particular process 1000 times for different rates. Each time, the result of the process is stored (appended) in a data.frame. However, the process takes an unusually long time, at times more than 2 hours. When I tried to find the reason for such a long process time, I realized that writing to the data.frame is consuming a lot of time.

Here is an extract of my code

# ---------------------------------------------------------------

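library(plyr)  # provides daply() and ddply() used below

# simulated_interest, simulated_exchange and simulated_instruments
# are not created in this extract.
tx_discounted    <- read.csv('transaction_discounted.csv', na.strings = '')
tx_discounted$id <- as.character(tx_discounted$id)

n <- max(unique(simulated_exchange$id))

result   <- NULL
current  <- 1
rcount   <- 0
current1 <- 1
rcount1  <- 0
current2 <- 1
rcount2  <- 0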
for (env in 0:n) {

  # Row counts are computed only while they are still 0 (i.e. on the first pass)
  # and reused afterwards; current* track the first row of the present block.
  if (rcount == 0) rcount <- nrow(subset(simulated_interest, id == env))
  temp        <- current + rcount - 1
  env_rates   <- simulated_interest[current:temp, ]
  env_rates   <- env_rates[order(env_rates$curve, env_rates$day_count), ]
  if (rcount1 == 0) rcount1 <- nrow(subset(simulated_exchange, id == env))
  temp        <- current1 + rcount1 - 1
  exch_rates  <- simulated_exchange[current1:temp, ]
  if (rcount2 == 0) rcount2 <- nrow(subset(simulated_instruments, id == env))
  temp        <- current2 + rcount2 - 1
  instr_rates <- simulated_instruments[current2:temp, ]
  current     <- current + rcount
  current1    <- current1 + rcount1
  current2    <- current2 + rcount2

  # One interpolation function per curve
  curve <- daply(env_rates, 'curve', function(x) {
    return(approxfun(x$day_count, x$rate, rule = 2))
  })

# ____________________________________________________

## Actual time consumption begins from the following part

# ____________________________________________________

  result <- rbind(result, ddply(tx_discounted, 'id', function(x) {

    if (!is.na(x$curve) && x$curve != '') {
      intrate <- curve[[x$curve]](x$maturity_period)
    } else {
      intrate <- subset(instr_rates, instrument == as.character(x$instrument))$value
    }

    cross_rate <- subset(exch_rates, key == paste(x$currency, x$currency_base, sep = '_'))$rate
    mtm_bc     <- cross_rate * (x$amount / (1 + ((intrate / 100) * (x$maturity_period / x$intbasis))))

    return(data.frame(env = env, id = x$id, instrument = x$instrument, currency = x$currency,
                      intrate = intrate, maturity_period = x$maturity_period, intbasis = x$intbasis,
                      cross_rate = cross_rate, amount = x$amount, mtm_bc = mtm_bc))
  }))
}


# ---------------------------------------------------------------------------
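To make the pattern in question concrete without the real inputs, here is a minimal sketch on toy data. The helper make_chunk() and its columns are hypothetical stand-ins for what one pass of the loop returns. It contrasts growing the data.frame with rbind() on every pass, as in the extract, against collecting the pieces in a preallocated list and binding once at the end. Whether this accounts for the run time on the actual data is an assumption that would need to be measured.

```r
# Hypothetical stand-in for the result of one pass of the loop.
make_chunk <- function(env) {
  data.frame(env = env, id = as.character(1:5), mtm_bc = env * (1:5),
             stringsAsFactors = FALSE)
}

n <- 20

# Pattern from the extract: grow the data.frame on every pass.
# Each rbind() copies everything accumulated so far.
grown <- NULL
for (env in 0:n) {
  grown <- rbind(grown, make_chunk(env))
}

# Alternative: keep each piece in a preallocated list and bind once.
chunks <- vector("list", n + 1)
for (env in 0:n) {
  chunks[[env + 1]] <- make_chunk(env)
}
bound <- do.call(rbind, chunks)

# Reset row names so the two results can be compared directly.
rownames(grown) <- NULL
rownames(bound) <- NULL
```

Both versions produce the same rows in the same order; the second only changes when the copying happens.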

Unfortunately, I can't share the input files. Is there any way I can improve the process time?
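Since the files cannot be shared, the following is a sketch on invented data of the same discounting formula applied to whole columns at once, with the cross rate looked up through match() rather than one subset() call per transaction. The column names follow the extract, but all values are made up, and the sketch assumes that intrate has already been resolved for each row (the curve interpolation step is left out).

```r
# Hypothetical toy inputs; values are invented for illustration only.
tx <- data.frame(
  id              = c("a", "b", "c"),
  currency        = c("USD", "EUR", "USD"),
  currency_base   = c("GBP", "GBP", "GBP"),
  amount          = c(1000, 2000, 1500),
  intrate         = c(5, 4, 6),        # in percent, assumed already resolved
  maturity_period = c(180, 90, 360),
  intbasis        = c(360, 360, 360),
  stringsAsFactors = FALSE
)
exch_rates <- data.frame(
  key  = c("USD_GBP", "EUR_GBP"),
  rate = c(0.70, 0.76),
  stringsAsFactors = FALSE
)

# Look up all cross rates in one call instead of one subset() per row.
keys          <- paste(tx$currency, tx$currency_base, sep = "_")
tx$cross_rate <- exch_rates$rate[match(keys, exch_rates$key)]

# Same formula as in the extract, evaluated on whole columns.
tx$mtm_bc <- tx$cross_rate *
  (tx$amount / (1 + (tx$intrate / 100) * (tx$maturity_period / tx$intbasis)))
```

A key with no entry in exch_rates yields NA from match(), so missing rates show up as NA in mtm_bc instead of silently dropping rows.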

Regards, and thanks in advance,

Amelia


