[R] Newbie help: Data in an arma fit

p at dirac.org p at dirac.org
Tue Oct 23 05:14:22 CEST 2007


On Tue 23 Oct 07, 10:56 AM, Gad Abraham <g.abraham at ms.unimelb.edu.au> said:
> caffeine wrote:
>> I'd like to fit an ARMA(1,1) model to some data (Federal Reserve Bank
>> interest rates) that looks like:
>> ...
>> 30JUN2006, 5.05
>> 03JUL2006, 5.25
>> 04JUL2006, N                  <---- here!
>> 05JUL2006, 5.25
>> ...
>> One problem is that holidays have that "N" for their data.  As a test, I
>> tried fitting ARMA(1,1) with and without the holidays deleted.  In other
>> words, I fit the above data as well as this data:
>> ...
>> 30JUN2006, 5.05
>> 03JUL2006, 5.25
>> 05JUL2006, 5.25
>> ...
>> and the ARMA coefficients came out different.   My question is: Should I
>> delete all the holidays from my data file?   What exactly does R do with
>> the "N" values in the fit for the ARMA coefficients?
>> As a related question, the weekends don't have entries (since the FRB is
>> closed on all weekends).  Does the fact that my data is not regularly
>> spaced pose a problem for ARMA fitting?
>
> A few comments:
>
> * Is the time series stationary? You can't fit ARIMA to nonstationary data.
>
> * One thing you could try is linear regression of interest rate on time and 
> indicator variables for day of week and special days like holidays. Then 
> fit an ARIMA to the regression residuals.
>
> * Any specific reason why ARMA(1,1)? Have you looked at the acf and pacf of 
> the time series?
>
> Cheers,
> Gad

Hi Gad,

This is supposed to be more of an exercise in R than in fitting models.
Since the goal is simply to learn R, we're told to assume the series is
stationary (it is nearly weakly stationary anyway).

The choice of order is pretty arbitrary too.  Just an exercise in getting to
feel comfortable with R.

I guess my question is not so much about the FRB rates themselves as about
how R interprets data.   If I have a time series in a file:

1   .5
2   .6
3   .4
4   No data
5   .3
6   .8

Would it be appropriate to delete the line that says "No data"?  Or does R
ignore non-numerical data?

Sorry if my question was misleading!

Pete
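
For readers following the thread, here is a minimal sketch of how R treats
such entries. It is an illustration under stated assumptions, not a
definitive answer: it assumes a comma-separated layout like the FRB file
quoted above with "N" as the missing-value marker, the names lines, raw,
dat, x, fit_na and fit_del are made up for the example, and a simulated
series stands in for the real rates. R does not silently ignore a
non-numeric entry. Unless the marker is declared through na.strings, the
whole column is read as text. According to ?arima, missing values are
allowed and are handled exactly with method = "ML", so keeping the NA
preserves the spacing, while deleting the row makes the neighbouring days
adjacent, which may be why the two fits give different coefficients.

```r
# Toy file contents using the values from the example above,
# with "N" marking the day that has no observation (assumed layout).
lines <- c("1, 0.5", "2, 0.6", "3, 0.4", "4, N", "5, 0.3", "6, 0.8")

# Without na.strings, the "N" prevents the column from being numeric.
raw <- read.csv(text = lines, header = FALSE, strip.white = TRUE)
class(raw$V2)

# Declaring "N" as the missing marker yields a numeric column with an NA.
dat <- read.csv(text = lines, header = FALSE, strip.white = TRUE,
                na.strings = "N")
dat$V2

# A simulated ARMA(1,1) series stands in for the interest rates.
set.seed(1)
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 200)
x[c(50, 100)] <- NA            # pretend two days are holidays

# Fit with the gaps kept as NA (time spacing preserved).
fit_na <- arima(x, order = c(1, 0, 1), method = "ML")

# Fit after deleting the missing days (neighbours become adjacent).
fit_del <- arima(x[!is.na(x)], order = c(1, 0, 1), method = "ML")

# Compare the two sets of coefficients side by side.
rbind(kept_NA = coef(fit_na), deleted = coef(fit_del))
```

A marker that contains a space, such as "No data" in a
whitespace-separated file, would likely be split into two fields, so a
single-token marker or an explicit separator is the safer choice.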


