[R] about ARMA(p,q) SCAN method: SAS vs. R

Steve Chen steve at stat.tku.edu.tw
Tue Apr 6 11:42:27 CEST 2010


Hi all,

I am modifying a program I wrote earlier to perform the smallest canonical
correlation (SCAN) method for identifying ARMA(p,q) orders in time
series, but when I compared the output with SAS, there were some differences.

My SCAN R code can be downloaded from the following URL:

http://netstat.stat.tku.edu.tw/download/arma_scan_R.txt

I used Series_R (LA ozone data) from Box and Jenkins (4th edition) as an
example. A sample run can be done via

# ozone = scan("http://netstat.stat.tku.edu.tw/download/box_ozone.txt")
# source("http://netstat.stat.tku.edu.tw/download/arma_scan_R.txt")
# arma.scan(ozone)

First is the output of Squared Canonical Correlation Estimates:

SAS:

              Squared Canonical Correlation Estimates

Lags    MA 0    MA 1    MA 2    MA 3    MA 4    MA 5
AR 0  0.5352  0.2423  0.0696  0.0035  0.0112  0.0183
AR 1  0.0074  0.0199  0.0304  0.0399  0.0185  0.0052
AR 2  0.0173  0.0005  0.0003  0.0167  0.0123  0.0198
AR 3  0.0190  0.0003  0.0002  0.0230  0.0026  0.0287
AR 4  0.0130  0.0262  0.0214  0.0054  0.0206  0.0302
AR 5  0.0143  0.0068  0.0229  0.0230  0.0171  0.0187


My R-code:

        MA-0    MA-1    MA-2    MA-3    MA-4    MA-5
AR-0  0.5264  0.2342  0.0668  0.0033  0.0105  0.0000
AR-1  0.0080  0.0197  0.0299  0.0399  0.0183  0.0052
AR-2  0.0158  0.0005  0.0003  0.0167  0.0122  0.0198
AR-3  0.0153  0.0003  0.0002  0.0229  0.0025  0.0283
AR-4  0.0099  0.0262  0.0214  0.0054  0.0204  0.0302
AR-5  0.0116  0.0066  0.0225  0.0229  0.0174  0.0190

The results are similar. The main difference is in
the Chi-Square p-values:

SAS:

               SCAN Chi-Square[1] Probability Values

Lags    MA 0    MA 1    MA 2    MA 3    MA 4    MA 5
AR 0  <.0001  <.0001  0.0148  0.6003  0.3472  0.2307
AR 1  0.2073  0.0407  0.0164  0.0183  0.2326  0.3313
AR 2  0.0532  0.7927  0.8537  0.1190  0.1934  0.2555
AR 3  0.0435  0.8326  0.8736  0.1273  0.5318  0.0537
AR 4  0.0960  0.0356  0.1365  0.4074  0.1100  0.0910
AR 5  0.0812  0.3110  0.0288  0.0997  0.1517  0.1440

My R-code:

Chi-Square(1) Test p-value

        MA-0    MA-1    MA-2    MA-3    MA-4    MA-5
AR-0  0.0000  0.0004  0.0749  0.6971  0.4903  0.0000
AR-1  0.1880  0.2355  0.1496  0.1129  0.3625  0.5475
AR-2  0.0648  0.8515  0.9024  0.3151  0.3900  0.3592
AR-3  0.0696  0.8813  0.9112  0.2237  0.6875  0.1978
AR-4  0.1458  0.1738  0.2666  0.5628  0.2827  0.2174
AR-5  0.1168  0.4962  0.2103  0.2507  0.3148  0.3021

I checked the original paper by Tsay and Tiao:

Tsay, R.S. and Tiao, G.C. (1985). Use of Canonical Analysis in Time
Series Model Identification. Biometrika, 72, 299-315.

and compared the formula with the SAS/ETS manual, e.g.

http://support.sas.com/documentation/cdl/en/etsug/60372/HTML/default/etsug_arima_sect031.htm

I found that the formula for d(m,j) in the SAS manual is wrong. The correct
formula for d(m,j) should be something like

d(m,j) = 1 + 2*(r_1^2 + r_2^2 + ... + r_j^2)

but in the SAS/ETS manual, it is

d(m,j) = 1 + 2*(r_1 + r_2 + ... + r_(j-1))
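To make the effect of d(m,j) on the p-values concrete, here is a minimal sketch, not the code from arma_scan_R.txt. It assumes the test statistic has the form c(m,j) = -(n - m - j) * log(1 - lambda/d(m,j)), referred to a chi-square distribution with 1 degree of freedom, which is my reading of the SAS/ETS documentation. The function name scan_pvalue, its arguments, and the series length n = 216 used in the usage lines are assumptions for illustration; r is taken to hold the sample autocorrelations r_1, r_2, ... of the transformed series.

```r
# Hypothetical helper: chi-square(1) p-value for one cell (m, j) of the SCAN table.
#   lambda : squared canonical correlation estimate for the cell
#   r      : sample autocorrelations r_1, r_2, ... of the transformed series
#   n      : series length; m, j : AR and MA orders of the cell
#   form   : "squared" uses d = 1 + 2*(r_1^2 + ... + r_j^2)
#            "manual"  uses d = 1 + 2*(r_1 + ... + r_(j-1)), as quoted above
scan_pvalue <- function(lambda, r, n, m, j, form = c("squared", "manual")) {
  form <- match.arg(form)
  d <- if (form == "squared") {
    1 + 2 * sum(r[seq_len(j)]^2)
  } else {
    1 + 2 * sum(r[seq_len(max(j - 1, 0))])
  }
  # Assumed form of the statistic; asymptotically chi-square with 1 df
  stat <- -(n - m - j) * log(1 - lambda / d)
  pchisq(stat, df = 1, lower.tail = FALSE)
}

# For j = 0 both forms give d = 1, so only lambda matters
# (n = 216 is an assumed series length):
scan_pvalue(0.0080, r = numeric(0), n = 216, m = 1, j = 0)

# For j >= 1 the two forms of d give different p-values
# (r = 0.5 is a made-up autocorrelation, not taken from the ozone data):
scan_pvalue(0.02, r = 0.5, n = 216, m = 1, j = 1, form = "squared")
scan_pvalue(0.02, r = 0.5, n = 216, m = 1, j = 1, form = "manual")
```

In the MA 0 column d(m,j) = 1 under either formula, so any p-value difference in that column can only come from the lambda estimates themselves. Differences in the other columns may also reflect the choice of d(m,j).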

I plan to wrap my SCAN code and some other R code for time series into
a package, but given the p-value differences from the SAS output, I am not
sure whether my R code for SCAN is reliable enough for real applications.

Any suggestions? Thank you in advance.

Steve Chen
Associate Professor, Department of Statistics
Tamkang University, Taiwan


