[R] Estimating MA parameters through arima or through package "dlm"
Paul Gilbert
pgilbert902 at gmail.com
Sun Jan 24 16:45:22 CET 2016
(Sorry for the delay in responding to this.)
On 01/05/2016 06:00 AM, r-help-request at r-project.org wrote:
> Date: Mon, 4 Jan 2016 11:31:22 -0500
> From: Mark Leeds<markleeds2 at gmail.com>
> To: Stefano Sofia<stefano.sofia at regione.marche.it>
> Cc:"r-help at r-project.org" <r-help at r-project.org>
> Subject: Re: [R] Estimating MA parameters through arima or through
> package "dlm"
> Message-ID:
> <CAHz+bWYXanjp4ksu2SzMbFRLaKH8_ASwN07s_3eKYjeyWuAG1A at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi: I don't have time to look at the details of what you're doing but the
> "equivalence"
> between state space and arima ( as paul gilbert pointed out a few weeks ago
> ) is not a true equivalence.
To state this a bit more precisely, there is a true equivalence between
the input-output equivalence classes of (linear, time invariant)
state-space models with state dimension n and the input-output
equivalence classes of (linear, time invariant) ARMA models with
McMillan degree n. (In fact, the quotient spaces are diffeomorphic not
just isomorphic.) This means you should be able to get exactly
comparable results for anything that is an equivalence class invariant,
including residuals and anything calculated from residuals, such as
likelihood. Model roots and thus stability are also invariants, but you
probably will not get comparable results for most other things involving
parameters.
This is not just a "statistical equivalence" as is sometimes suggested,
it is an algebraic equivalence between quotient spaces.
However, if you estimate a state-space model and estimate an ARMA model,
there are several other things that come into play related to
estimation. Comparing the two estimated models you are unlikely to find
comparable results. (Typically ARMA estimation is more robust at finding
the best in my experience.) Even with simulation testing of estimation
starting with "true" models it is problematic to get estimated models
that are equivalent.
If you really want to see the equivalence you need to do a conversion of
a model from one form to the other. I cannot speak to dlm, but dse was
built for studying this equivalence and the users' guide has examples.
If you are really interested in this topic, I recommend section 5 of
http://www.bankofcanada.ca/1993/03/working-paper-199/,
I realize this is getting a bit old, but if you find a more up-to-date
summary I would like to hear about it. That paper also has a
demonstration that the equivalent models give results that are
comparable within numerical precision of computers, not just to some
statistical significance.
>
> if you are in an area of the parameter space that the state space
> formulation
> can't reach, then you won't get the same parameter estimates. so, what
> you're doing
> might be okay or might not be, depending on whether the state space
> formulation
> can reach that area of the parameter space.
"Can't reach" is an estimation problem. This is typically a more serious
problem with state-space models. The quotient spaces are diffeomorphic
so in theory it should be possible to reach the same solution if you
properly account for the fact that you are estimating on a smooth
manifold and not a vector space. In practice you have also to worry
about twisting of the parameter space and finite time for estimation,
and gradients that may converge toward zero near boundaries of the
manifold's charts.
> there's another state space
> formulation that is truly equivalent which is called the SSOE formulation
> or innovations representation but
> I don't know if you want to get into that. google "SSOE state space" if
> you're interested.
The quotient space of input-output equivalence classes for innovations
form models is equivalent to the quotient space of input-output
equivalence classes for non-innovations form models. You need more
information to identify a non-innovations form, typically some physical
understanding of the system. On the bases of only input-output data, and
with no additional understanding of the physical system, there would be
no reason to choose a non-innovations form for estimation. There is more
discussion of this in the above mentioned summary and in the dse user's
guide.
BTW, estimation problems tend to be much more severe with multivariate
series than with univariate series. This is not just because of the
usual issues. Especially in state-space representations, the twisting of
the parameter space seems to be especially bad.
Paul
>
>
> Mark
>
>
> On Mon, Jan 4, 2016 at 9:25 AM, Stefano Sofia <
> stefano.sofia at regione.marche.it> wrote:
>
>> >Dear list users,
>> >I want to use apply a MA(2) process (x=beta1*epsilon_(t-1) +
>> >beta2*epsilon_(t-1) + epsilon_(t)) to a given time series (x), and I want
>> >to estimate the two parameters beta1, beta2 and the variance of the random
>> >variable epsilon_(t).
>> >
>> >If I use
>> >MA2_1 <- Arima(x, order=c(0,0,2))
>> >I get the following result
>> >
>> >[1] "MA2_1"
>> >Series: x
>> >ARIMA(0,0,2) with non-zero mean
>> >
>> >Coefficients:
>> > ma1 ma2 intercept
>> > -0.0279 0.0783 5.3737
>> >s.e. 0.0667 0.0622 0.0245
>> >
>> >sigma^2 estimated as 0.1284: log likelihood=-92.63
>> >AIC=193.25 AICc=193.43 BIC=207.11
>> >[1] 0 2 0 0 1 0 0
>> >
>> > From this straightforward analysis V[epsilon]=0.1284, beta1=-0.0279 and
>> >beta2=0.0783.
>> >
>> >I also tried to use a DLM representation of ARIMA models and estimate the
>> >unknown parameters by maximum likelihood through the dlm package (in
>> >particular applying the example at section 3.2.6, page 115, of "Dynamic
>> >Linear Models with R" by Petris, Petrone and Campagnoli:
>> >
>> >arma_parameters <- function(x)
>> >{
>> > buildGap <- function(u)
>> > {
>> > gap <- dlmModARMA(ma = u[2 : 3], sigma2 = u[1])
>> > return(gap)
>> > }
>> > init <- c(0.005, 0.004, 0.003)
>> > outMLE <- dlmMLE(x, init, buildGap)
>> > dlmGap <- buildGap(outMLE$par)
>> >}
>> >
>> >and this gives:
>> >[1] "outMLE"
>> >$par
>> >[1] 1.00816794 0.02349296 0.02364788
>> >
>> >$value
>> >[1] 3089.196
>> >
>> >$counts
>> >function gradient
>> > 10 10
>> >
>> >$convergence
>> >[1] 0
>> >
>> >$message
>> >[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
>> >
>> >[1] "dlmGap"
>> >$FF
>> > [,1] [,2] [,3]
>> >[1,] 1 0 0
>> >
>> >$V
>> > [,1]
>> >[1,] 0
>> >
>> >$GG
>> > [,1] [,2] [,3]
>> >[1,] 0 1 0
>> >[2,] 0 0 1
>> >[3,] 0 0 0
>> >
>> >$W
>> > [,1] [,2] [,3]
>> >[1,] 1.00816794 0.0236848488 0.0238410337
>> >[2,] 0.02368485 0.0005564272 0.0005600964
>> >[3,] 0.02384103 0.0005600964 0.0005637899
>> >
>> >$m0
>> >[1] 0 0 0
>> >
>> >$C0
>> > [,1] [,2] [,3]
>> >[1,] 1e+07 0e+00 0e+00
>> >[2,] 0e+00 1e+07 0e+00
>> >[3,] 0e+00 0e+00 1e+07
>> >
>> >In this case
>> >V[epsilon]=W[1,1]=1.00816794
>> >beta1=W[2,1]/W[1,1]=0.02349296
>> >beta2=W[3,1]/W[1,1]=0.02364788
>> >
>> >I presume that these two approaches should give comparable results, but
>> >this does not happen.
>> >Is the model that I used correct? And does it make sense to perform this
>> >kind of comparison?
>> >
>> >This is the log of a rainfall time series (which has already been
>> >deseasonalised):
>> >[1] 6.014937 4.978801 5.654592 5.616771 5.612398 5.837147 5.121580 5.832176
>> >[9] 5.205654 5.355642 5.405376 6.257859 5.516247 5.500850 4.708629 5.482304
>> >[17] 5.689684 5.727824 4.779123 5.289277 5.217107 5.976351 4.630838
>> >5.683240
>> >[25] 5.345678 5.906179 5.605434 5.497578 5.898801 5.660875 5.111988
>> >5.571013
>> >[33] 5.949340 5.374352 4.841033 5.995706 5.661223 5.458734 4.454347
>> >5.795754
>> >[41] 5.995706 5.596939 5.399971 5.908898 5.282696 5.438514 5.528635
>> >6.022721
>> >[49] 5.524257 5.519459 4.957235 5.547518 5.080783 5.411200 5.056883
>> >5.798183
>> >[57] 5.086361 5.536547 5.220356 5.141664 5.847017 5.052417 5.734635
>> >5.340419
>> >[65] 5.724238 5.634432 5.685958 5.307773 5.817706 5.134032 4.987708
>> >5.110179
>> >[73] 5.423628 5.347108 4.859037 5.556828 5.487283 5.661223 5.732370
>> >5.469325
>> >[81] 5.726848 5.419207 5.172187 5.608006 5.130490 5.586874 5.171052
>> >5.683240
>> >[89] 4.674696 5.286245 5.342813 5.370638 5.432411 5.748118 6.355239
>> >5.557986
>> >[97] 5.399067 5.222516 5.279644 5.425390 5.540871 5.917818 5.132853
>> >5.689007
>> >[105] 5.900993 5.007296 5.102911 5.778271 5.318120 5.927726 5.066385
>> >5.716699
>> >[113] 5.511815 4.714921 5.383577 5.319100 5.269403 5.354698 5.145749
>> >5.204556
>> >[121] 5.878296 5.070161 5.441552 5.213304 5.450180 5.695750 4.893352
>> >5.425390
>> >[129] 5.682559 5.487283 4.213608 5.751620 5.432411 5.379436 5.700444
>> >5.580484
>> >[137] 5.357529 5.319100 4.532599 5.603225 5.208393 5.254888 5.017280
>> >5.349961
>> >[145] 4.374498 5.187944 5.585374 5.716370 3.561046 5.119789 5.163070
>> >5.422745
>> >[153] 5.863915 5.651436 4.762174 5.655642 4.797442 5.735927 4.911183
>> >5.240688
>> >[161] 5.148076 5.477300 4.572647 5.493473 5.437644 4.854371 4.908233
>> >4.755313
>> >[169] 5.582744 5.527841 5.613128 5.211124 5.275049 5.462984 5.016617
>> >5.981919
>> >[177] 5.566817 5.094364 5.314191 5.712742 5.299317 5.452325 4.691348
>> >5.851628
>> >[185] 5.410753 5.488938 5.660179 5.900993 5.380819 5.256453 4.781641
>> >5.531807
>> >[193] 5.497578 5.274537 4.325456 5.271973 5.077047 5.258536 5.280662
>> >5.247024
>> >[201] 5.995208 4.700480 4.991113 5.457029 5.194622 5.487283 5.197391
>> >5.747161
>> >[209] 5.842094 5.372497 5.306781 5.641907 5.565286 5.259057 5.241218
>> >4.759607
>> >[217] 4.550714 5.230574 4.470495 5.664348 4.846547 5.771130 4.823502
>> >5.598422
>> >[225] 5.627621 5.547518 5.596939 5.468482 5.536940 5.606170 5.281680
>> >5.656691
>> >[233] 5.283204 5.752255 5.192401 4.550714
>> >
>> >
>> >Thank you for your attention and your help
>> >Stefano
>> >
>> >
More information about the R-help
mailing list