[R] Testing predictive power of ARIMA model

Sat Dec 13 13:34:06 CET 2008

Evan DeCorte wrote:
> I am trying to make estimates of the predictive power of ARIMA models estimated by the auto.arima() function. 
> 
> I am looping through a large number of time seiries and fitting ARIMA models with the following code. 
> 
> data1 <- read.csv(file = "case.csv", header = T)
> data <- data1
> 
> output = c(1:length(data))
> 
> for(i in 1:length(data))
> {
> 	point_data = unlist(data[i], use.names = FALSE)
> 	x = auto.arima(point_data , max.p = 10, max.q = 10, max.P = 0, max.Q = 0, approximation = TRUE)
> 
> }
> 
> However, I would like to find a way to test the out of sample predictive power of these models. I can think of a few ways I MIGHT be able to do this but nothing clean. I am a recen R user and despite my best efforts (looking on the mailing list, reading documentation) I cant figure out the best way to do this. 
> 
> I tried including something like this:
> 
> output[i] = cor(model_data, real_data)
> 
> but with poor results.
> 
> Does anyone have any tricks to calculate the R^2 or an ARIMA model. Sample code would be apreciated.

There are a couple of issues here.

First, how to measure the predictive power of the model. I think a 
reasonable measure is mean-square error, i.e., predict ahead some k time 
steps, and compare that prediction with the observed timeseries. You can 
plot the MSE versus the forecast horizon. You can also calculate the 
proportion of explained variance MSE/Var(x).

Second, the "out of sample" issue. You can use cross-validation or the 
moving-blocks bootstrap, in essence cutting the timeseries into separate 
blocks for training and testing. The blocks have to be large enough to 
cover all the ARIMA terms (e.g., more than 10 time steps for an AR(10)).

-- 
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham