[R] AIC and BIC from arima()

Tue Nov 29 19:05:52 CET 2005

Prof Brian Ripley wrote:

>> My ultimate goal is to best fit time series by comparing AICs and
>> BICs (as in Bayesian) from arima() and nnet().
> Whoa! nnet() does not do maximum likelihood fitting so AIC and BIC
> are not even defined. On the other hand, ?WWWusage has an example
> of choosing an ARIMA fit by AIC.
Thanks: I looked at it.

>> I looked at the arima.R source code, but I am afraid I do not
>> understand it. All I am really missing is the number of
>> parameters p, where AIC = n*log(S/n) + 2*p, with S the sum of
>> squared residuals and n the number of observations. Can I get p
>> from the arima() result (for both seasonal and non-seasonal cases)?

> By reading the help page: coef: a vector of AR, MA and regression
> coefficients, which can be extracted by the 'coef' method. so
> length(fit$coef) will tell you how many parameters you have fitted,
> and if you read on aic: the AIC value corresponding to the
> log-likelihood. Only valid for 'method = "ML"' fits.
I saw that, but it did not match my formula, so I thought it was a
different AIC...
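
For the record, a minimal sketch of the pointers quoted above, using the
built-in WWWusage series and the ARIMA(3,1,0) order from the ?WWWusage
example. The names p, aic_by_hand and bic_like are only illustrative, and
taking n as nobs(fit) for the Schwarz-type penalty is an assumption: as
noted in the reply, n is not clear-cut here.

```r
# Fit the ARIMA(3,1,0) model from the ?WWWusage example by maximum likelihood.
fit <- arima(WWWusage, order = c(3, 1, 0), method = "ML")

# Number of fitted AR, MA and regression coefficients.
p <- length(coef(fit))

# arima() also estimates the innovations variance, so the penalty in its
# AIC counts p + 1 parameters.
aic_by_hand <- -2 * fit$loglik + 2 * (p + 1)
print(c(p = p, aic = fit$aic, aic_by_hand = aic_by_hand))

# Schwarz-type criterion: same log-likelihood, penalty log(n) per parameter.
# ASSUMPTION: n is the number of observations used in the fit, nobs(fit).
bic_like <- AIC(fit, k = log(nobs(fit)))
print(bic_like)
```

The same calls should apply to a seasonal fit, since coef() returns the
seasonal AR and MA coefficients as well.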

> You give us no idea where you got the formula for 'AIC' from, but
> it is not that introduced by Akaike (1973, 4) and commonly used in
> time-series (and by arima()). I think you are applying a formula
> applicable to linear regression for independent observations,
> incorrectly. There really are a lot of subtleties here, and
> although p is well-defined, n is not. Thus applying Schwarz's
> criterion (aka BIC in one of its senses) is not at all clearcut, a
> not uncommon situation with non-iid sampling.

I got the formula from the nnts package, where p is the number of weights
and n the number of fitted values in the nnet case. I naively thought I
could use the same formula for both arima and nnet.
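
To see how far apart the two definitions are, here is a rough sketch that
computes the least-squares form n*log(S/n) + 2*p from the residuals of an
arima() fit and prints it next to the likelihood-based value stored in the
fit. Using the built-in WWWusage series and taking n as the number of
residuals are assumptions made for illustration only.

```r
# Same kind of fit as in the ?WWWusage example.
fit <- arima(WWWusage, order = c(3, 1, 0), method = "ML")

res <- residuals(fit)
n <- length(res)        # ASSUMPTION: one possible convention for n
S <- sum(res^2)         # sum of squared residuals
p <- length(coef(fit))  # number of fitted coefficients

aic_ls <- n * log(S / n) + 2 * p  # least-squares form from the question
aic_ml <- fit$aic                 # value reported by arima()

print(c(least_squares_form = aic_ls, arima = aic_ml))
```

The least-squares form omits the additive terms of the Gaussian
log-likelihood and does not count the variance as a parameter, so values
from the two definitions should not be compared with each other.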

I guess that at my level of (in)competence, I could just stick to the
squared residuals for comparing arima and nnet results?
I was also thinking, in my application, of letting the user choose a
test period for evaluating the fit. For example, for a 12-month data
span, the test period would be the last month: fitting would be done
using the first 11 months, then the 12th month would be predicted and
compared with the actual data.
Or even use as the criterion a weighted combination of the residuals for
the first 11 months and the prediction errors for the last month?
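
The test-period idea could be sketched roughly as follows, here with
arima() only and the built-in WWWusage series standing in for real data.
The hold-out length h, the candidate orders and the helper holdout_rmse
are hypothetical choices for illustration; any other model that can
forecast h steps ahead (nnet included) could be scored the same way.

```r
# Hold out the last h observations, fit on the rest, score the forecasts.
y <- WWWusage
h <- 10                                        # hypothetical test-period length
n <- length(y)
train <- window(y, end = time(y)[n - h])       # fitting period
test  <- window(y, start = time(y)[n - h + 1]) # held-out test period

# Root mean squared forecast error over the test period for one ARIMA order.
holdout_rmse <- function(order) {
  fit  <- arima(train, order = order, method = "ML")
  pred <- predict(fit, n.ahead = h)$pred
  sqrt(mean((as.numeric(test) - as.numeric(pred))^2))
}

# Hypothetical candidate orders; the smallest score is the preferred model.
candidates <- list(ar1 = c(1, 1, 0), ar3 = c(3, 1, 0), arma11 = c(1, 1, 1))
scores <- sapply(candidates, holdout_rmse)
print(scores)
```

Because the score uses only data the model has not seen, it does not
depend on the number of parameters or on a likelihood being defined.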

Thank you very much for all the pointers and your patience.

--

Jean-Luc Fontaine  http://jfontain.free.fr/