[R] Testing predictive power of ARIMA model
gabraham at csse.unimelb.edu.au
Sat Dec 13 13:34:06 CET 2008
Evan DeCorte wrote:
> I am trying to make estimates of the predictive power of ARIMA models estimated by the auto.arima() function.
> I am looping through a large number of time series and fitting ARIMA models with the following code.
> library(forecast)  # provides auto.arima()
> data1 <- read.csv(file = "case.csv", header = TRUE)
> data <- data1
> output <- numeric(length(data))
> for (i in seq_along(data)) {
>   point_data <- unlist(data[i], use.names = FALSE)
>   x <- auto.arima(point_data, max.p = 10, max.q = 10, max.P = 0, max.Q = 0, approximation = TRUE)
> }
> However, I would like to find a way to test the out-of-sample predictive power of these models. I can think of a few ways I MIGHT be able to do this, but nothing clean. I am a recent R user and despite my best efforts (looking on the mailing list, reading documentation) I can't figure out the best way to do this.
> I tried including something like this:
> output[i] = cor(model_data, real_data)
> but with poor results.
> Does anyone have any tricks to calculate the R^2 of an ARIMA model? Sample code would be appreciated.
There are a couple of issues here.
First, how to measure the predictive power of the model. I think a
reasonable measure is mean-square error (MSE): predict ahead some k time
steps, and compare that prediction with the observed time series. You can
plot the MSE versus the forecast horizon. You can also calculate the
proportion of explained variance, 1 - MSE/Var(x).
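
As a rough sketch of that measure (not tested on your data): hold out the
last k points, fit on the rest, forecast k steps ahead and compare. The
simulated series, the horizon k = 10 and the fixed AR(1) order are
assumptions for illustration only; I use stats::arima() so the example
runs in base R, but a model chosen by auto.arima() could be substituted.

```r
# Simulated AR(1) series stands in for one column of the real data (assumption).
set.seed(1)
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 200))

k <- 10                        # forecast horizon (hypothetical choice)
n <- length(x)
train <- x[1:(n - k)]          # data used for fitting
test  <- x[(n - k + 1):n]      # held-out data, never seen by the model

# Fixed-order fit with base R; the order is an assumption, not a recommendation.
fit  <- arima(train, order = c(1, 0, 0), method = "ML")
pred <- as.numeric(predict(fit, n.ahead = k)$pred)

sq_err    <- (test - pred)^2   # squared error at each horizon 1..k
mse       <- mean(sq_err)      # out-of-sample mean-square error
explained <- 1 - mse / var(test)
```

Note that `explained` can be negative when the forecasts do worse than
simply predicting the mean of the held-out segment, so it is not a true
R^2 in the regression sense.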
Second, the "out of sample" issue. You can use cross-validation or the
moving-blocks bootstrap, in essence cutting the time series into separate
contiguous blocks for training and testing. The blocks have to be large
enough to cover all the ARIMA terms (e.g., more than 10 time steps for an
AR(10)).
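
One simple way to do the train/test cutting is a rolling forecast origin:
fit on everything up to some point, forecast the next k steps, then move
the origin forward and repeat. This is a variant of the block idea, not
the moving-blocks bootstrap itself. The sketch below again uses a
simulated series and a fixed AR(1) order in stats::arima(); the values of
k, min_train and the step of 20 are arbitrary assumptions.

```r
# Simulated series as a stand-in for real data (assumption).
set.seed(2)
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))

k         <- 5      # forecast horizon (hypothetical)
min_train <- 100    # smallest training block; should comfortably exceed the ARIMA order
origins   <- seq(min_train, length(x) - k, by = 20)

# One row per forecast origin, one column per horizon 1..k.
sq_err <- t(sapply(origins, function(end) {
  fit  <- arima(x[1:end], order = c(1, 0, 0), method = "ML")
  pred <- as.numeric(predict(fit, n.ahead = k)$pred)
  (x[(end + 1):(end + k)] - pred)^2   # errors on data the fit has not seen
}))

mse_by_h <- colMeans(sq_err)          # out-of-sample MSE at each horizon
# plot(seq_len(k), mse_by_h, type = "b", xlab = "Horizon", ylab = "MSE")
```

Averaging over several origins gives the MSE-versus-horizon curve
mentioned above, rather than the squared errors from a single split.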
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au