[R] Simulating SVAR Data

Downey, Patrick PDowney at urban.org
Wed Jun 1 15:12:53 CEST 2011


Hello,

I'd like to simulate data according to an SVAR model in order to
demonstrate how other techniques (such as arima) yield biased estimates. I
am interested in a 2 variable SVAR with 2 lags (in the notation of the vars
vignette, K = 2, P = 2, where B = I_K). I'm using the {vars} package
outlined here:
http://cran.r-project.org/web/packages/vars/vignettes/vars.pdf

I thought that the following would generate the data and demonstrate the
accuracy of an SVAR compared to an arima, but the results are not what I
expected. I think the problem is the way that I'm generating the data, but
I don't understand what I could be doing wrong. Any guidance would be
greatly appreciated.

Problems:

In the code below, the array "means" should show that SVAR parameters are
unbiased estiamtors, so the second column of "means" should be near 0, and
it's not any closer than the first column (the arima results). So my
results are no less biased than the arima results.

The first pair of plots should show the same: SVAR results are unbiased.
But they don't cluster around the red dots (true parameter values), so they
aren't.

Finally, the second pair of plots should show the parameters are
consistent: that MSE decreases as sample size increases. They don't really
show that. Perhaps they would if the smallest sample were smaller than 200,
but the SVAR tends not to converge with fewer observations. Further, the
MSE tends to be at least as high with the SVAR compared to arima, so it's
not any more accurate.

Program:

##### Model #####
# Y(t) = a0 + a1*Y(t-1) + a2*Y(t-2) + a3*X(t-1) + a4*X(t-2) + e(t)
# X(t) = b0 + b1*X(t-1) + b2*X(t-2) + b3*Y(t-1) + b4*Y(t-2) + b5*Y(t) +
d(t)
# e(t) & d(t) ~ N(0,s)
# So Y has a contemporaneous impact on X
# X only has an impact on future Ys
# So this is the setup of a SVAR

##### Choosing parameters #####

# Currently, all parameters are just random numbers less than 1 so that 
# it's a stationary series
# The standard deviations of the error terms are also random ~ U(0,2)

a0 <- runif(1,-1,1)
a1 <- runif(1,-0.9,0.9)
a2 <- runif(1,-abs(a1),abs(a1))
a3 <- runif(1,-0.9,0.9)
a4 <- runif(1,-abs(a3),abs(a3))
s.e <- runif(1,0,2)

b0 <- runif(1,-1,1)
b1 <- runif(1,-0.9,0.9)
b2 <- runif(1,-abs(b1),abs(b1))
b3 <- runif(1,-0.9,0.9)
b4 <- runif(1,-abs(b3),abs(b3))
b5 <- runif(1,-0.9,0.9)
s.d <- runif(1,0,2)


################### A Formal Test: Loop #####################

Z <- 100
error <- array(0,dim=c(Z,2,6))
N <- runif(Z,200,1200)
# Let the sample size vary so that we can check for consistency


##### Generating the data #####

# Start with 2 initial values and then create a dataset with 1200
observations

X <- rep(0,1202)
Y <- rep(0,1202)

Y[1] <- rnorm(1, a0 + a1*a0 + a2*a0 + a3*b0 + a4*b0, s.e)
X[1] <- rnorm(1, b0 + b1*b0 + b2*b0 + b3*a0 + b4*a0 + b5*Y[1], s.d)

Y[2] <- rnorm(1, a0 + a1*Y[1] + a2*a0 + a3*X[1] + a4*b0, s.e)
X[2] <- rnorm(1, b0 + b1*X[1] + b2*b0 + b3*Y[1] + b4*a0 + b5*Y[2], s.d)

for(t in 3:1202){
 Y[t] <- rnorm(1, a0 + a1*Y[t-1] + a2*Y[t-2] + 
         a3*X[t-1] + a4*X[t-2], s.e)
 X[t] <- rnorm(1, b0 + b1*X[t-1] + b2*X[t-2] + 
         b3*Y[t-1] + b4*Y[t-2] + b5*Y[t], s.d)
}

L1.Y <- rep(NA,1202)
L2.Y <- rep(NA,1202)
for(t in 2:1202){
 L1.Y[t] <- Y[t-1]
 L2.Y[t] <- L1.Y[t-1]
}



for(z in 1:Z){

n <- N[z]
x <- X[3:(n+2)]
y <- Y[3:(n+2)]
L1.y <- c(NA,L1.Y[4:(n+2)])
L2.y <- c(NA,NA,L2.Y[5:(n+2)])


##### Modeling x with inclusion of y #####

m2 <- arima(x,order=c(2,0,0),xreg=cbind(L1.y,L2.y,y))
error[z,1,1] <- m2$coef[3] - b0
error[z,1,2:3] <- m2$coef[1:2] - c(b1,b2)
error[z,1,4:6] <- m2$coef[4:6] - c(b3,b4,b5)

##### SVAR of x and y #####

m3 <- VAR(cbind(y,x),p=2)
A <- matrix(c(1,NA,0,1),ncol=2)
m4 <- SVAR(m3,Amat=A)

error[z,2,1] <- m4$var$varresult$x$coeff[5] - b0
error[z,2,2] <- m4$var$varresult$x$coeff[2] - b1
error[z,2,3] <- m4$var$varresult$x$coeff[4] - b2
error[z,2,4] <- m4$var$varresult$x$coeff[1] - b3
error[z,2,5] <- m4$var$varresult$x$coeff[3] - b4
error[z,2,6] <- m4$A[2,1] - b5

}

mse <- error^2

means <- array(0,dim=c(6,2))
for(i in 1:2){
 means[,i] <- apply(error[,i,],2,mean)
}
means



par(mfrow=c(2,1))
for(i in 1:2){
 plot(1+runif(Z,-0.4,0.4),error[,i,1],xlim=c(0.5,6.5),
    xaxt='n',ylim=c(-1,1),ylab='',xlab='',main=paste("Model ",i))
  for(j in 2:6){
   points(j+runif(Z,-0.4,0.4),error[,i,j])
  }
 points(seq(1,6,1),c(b0,b1,b2,b3,b4,b5),col=2)
}
par(mfrow=c(1,1))



par(mfrow=c(2,1))
for(i in 1:2){
 plot(N,mse[,i,1],ylim=c(0,max(mse[,i,2:6])),main=paste("Model ",i))
  for(j in 2:6){
   points(N,mse[,i,j],col=j)
}}
par(mfrow=c(1,1))



More information about the R-help mailing list