[R] MA process in panels

Wed Jan 9 00:29:35 CET 2013

Dear R users,

after some time I have picked up working on this dataset again. I have found
a way which produces reasonable results but I am not sure whether it is
truly the correct way to go.  

I am still looking for a way to estimate a panel ARMA(1,1) with
cross-sectional fixed effects (later I wish to include time fixed effects as
well -- but for the sake of simplicity, I drop these for now). Since I have
a large panel (see above), I wish to regress demeaned time series. After
trying out a lot of different approaches (mainly using the nlme-package and
the plm-package, but also trying to specify a maxLik function) without
figuring out a way to estimate the ARMA, I began thinking about the arima
function -- and how it could help me to solve my issue. 

Please find attached my thoughts on this with an example based on the
Grunfeld dataset (unfortunately I did not find a larger, more appropriate
panel dataset). If my approach is incorrect, I hope these thoughts
nevertheless help some more advanced R users to find a proper way to
estimate panel ARMAs. 

Any comment is highly appreciated! 

Thank you very much again,
Philipp Grueber 

# Data Import
library(plm)
library(lmtest)
data(Grunfeld)

tslag <- function(x, d=l)
{
  x <- as.vector(x)
  n <- length(x)
  c(rep(NA,d),x)[1:n]
}

#In a final dataset, lagging should be done for each cross-section. For the
sake of comparability of arima(...,xreg=Grunfeld$inv1,order=c(0,0,0)) with
arima(...,xreg=0,order=c(1,0,0)), let's do this:
Grunfeld$inv1<-tslag(Grunfeld$inv,d=1)

# In order to avoid differing results due to heterogenous handling of NAs:
Grunfeld<-Grunfeld[-1,]

# Create dummy variables for comparison:
mat_i<-as.data.frame.matrix(table(cbind.data.frame(N=1:nrow(Grunfeld),T=Grunfeld$firm)))[,-1]

#Now, these two should be the same, but there seems to be a convergence
problem (not enough data?).
arima_0<-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,mat_i),transform.pars=F,include.mean=T,order=c(1,0,0))
coef(arima_0)[1:4]

arima_1<-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$inv1,Grunfeld$value,Grunfeld$capital,mat_i),transform.pars=F,include.mean=T,order=c(0,0,0))
coef(arima_1)[1:4]

#I can show that they are not the same but closer using better data: 
a<-arima.sim(n = 10001,model=list(ar=0.5,order=c(1,0,0)))
b<-a[-1]
c<-tslag(a,d=1)[-1]
coef(lm(b~c))
coef(arima(x=b,include.mean=T,order=c(1,0,0)))
coef(arima(x=b,xreg=c,include.mean=T,order=c(0,0,0)))

coef(lm(b~c-1))
coef(arima(x=b,include.mean=F,order=c(1,0,0)))
coef(arima(x=b,include.mean=F,xreg=c,order=c(0,0,0)))

#Probably, this does not matter as for an ARMA(1,0) model, I would prefer to
use OLS and thus, the lm function anyway!

#Why do I want to know? Because only if these two (arima_0 and arima_1) are
the same, I would be able to use the cross-sectionally lagged series of inv,
inv_1 as the lagged endogenous variable, right?
Grunfeld$inv_1<-c()
for (i in unique(Grunfeld$firm)){
	Grunfeld$inv_1[Grunfeld$firm==i]<-tslag(Grunfeld$inv[Grunfeld$firm==i],d=1)
}
#note: for the sake of comparability, I will not do so in the following.

# Create demeaned series
y_i<-Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$firm)
y1_i<-Grunfeld$inv1-ave(Grunfeld$inv1,index=Grunfeld$firm)
x1_i<-Grunfeld$value-ave(Grunfeld$value,index=Grunfeld$firm)
x2_i<-Grunfeld$capital-ave(Grunfeld$capital,index=Grunfeld$firm)

y_it1<-y_i-ave(y_i,index=Grunfeld$year)
y1_it1<-y1_i-ave(y1_i,index=Grunfeld$year)
x1_it1<-x1_i-ave(x1_i,index=Grunfeld$year)
x2_it1<-x2_i-ave(x2_i,index=Grunfeld$year)

#In order to simplify my calculation, I now wish to use demeaned data. I am
able to show that these two are the same: 
arima_a<-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,Grunfeld$inv1,mat_i),transform.pars=F,include.mean=T,order=c(0,0,0))
coef(arima_a)[1:4]
arima_b<-arima(x=y_i,xreg=cbind(y1_i,x1_i,x2_i),transform.pars=F,include.mean=F,order=c(0,0,0))
coef(arima_b)[1:3]

#Even though I believe I do not need a constant term (due to demeaning),
here's the test:
arima_c<-arima(x=y_i,xreg=cbind(y1_i,x1_i,x2_i),transform.pars=F,include.mean=T,order=c(0,0,0))
coef(arima_c)[1:4]

#Related with the above question is the fact, that these coefficients are
different from the following ones:
arima_d<-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,mat_i),transform.pars=F,include.mean=T,order=c(1,0,0))
coef(arima_d)[1:4]
arima_e<-arima(x=y_i,xreg=cbind(x1_i,x2_i),transform.pars=F,include.mean=F,order=c(1,0,0))
coef(arima_e)[1:3]
arima_f<-arima(x=y_i,xreg=cbind(x1_i,x2_i),transform.pars=F,include.mean=T,order=c(1,0,0))
coef(arima_f)[1:4]

#Finally, I wish to complete my ARMA by introducing the MA process.
arima_g<-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,Grunfeld$inv1,mat_i),transform.pars=F,include.mean=T,order=c(0,0,1))
coef(arima_g)[1:5]
arima_h<-arima(x=y_i,xreg=cbind(y1_i,x1_i,x2_i),transform.pars=F,include.mean=F,order=c(0,0,1))
coef(arima_h)[1:4]
arima_i<-arima(x=y_i,xreg=cbind(y1_i,x1_i,x2_i),transform.pars=F,include.mean=T,order=c(0,0,1))
coef(arima_i)[1:5]
#
arima_j<-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,mat_i),transform.pars=F,include.mean=T,order=c(1,0,1))
coef(arima_j)[1:5]
arima_k<-arima(x=y_i,xreg=cbind(x1_i,x2_i),transform.pars=F,include.mean=F,order=c(1,0,1))
coef(arima_k)[1:4]
arima_l<-arima(x=y_i,xreg=cbind(x1_i,x2_i),transform.pars=F,include.mean=T,order=c(1,0,1))
coef(arima_l)[1:5]

# The coefficients of models "g" through "l" are reasonably close. Am I
right to assume that model "i" using arima(...,xreg=y1_i,order=c(0,0,1)) is
preferred, and that even though the intercept should be 0, including a
constant is safer? (note: plus I need to replace Grunfeld$inv1 by
Grunfeld$inv_1...)  

-----
____________________________________
EBS Universitaet fuer Wirtschaft und Recht
FARE Department
Wiesbaden/ Germany
http://www.ebs.edu/index.php?id=finacc&L=0
--
View this message in context: http://r.789695.n4.nabble.com/MA-process-in-panels-tp4489528p4654989.html
Sent from the R help mailing list archive at Nabble.com.