[R] Estimating cluster standard errors in Diff-in-Diff panel models with plm

Tue May 9 23:09:33 CEST 2017

Hi,
I want to estimate cluster-robust standard errors for a
difference-in-differences panel model with 100 groups, 6,156
individuals and 15 years. Some individuals appear more than once
(4,201 are unique) because the data are a matched sample obtained
with one-to-one matching with replacement.
I have been using plm to estimate the model coefficients, after
converting my matched sample into a pdata.frame with individuals and
years as indexes. I have also been able to estimate cluster-robust
standard errors at the individual level with the vcovHC function.
However, the individuals are nested within the groups, so I want to
cluster at this higher level of aggregation rather than at the
individual level. Unfortunately, it is not clear to me how to
proceed. If I replace individuals with groups in the index, the
group-year pairs are no longer unique, I get duplicate row.names, and
I can't estimate the panel model with plm. I get the following error
message:

Error in `row.names<-.data.frame`(`*tmp*`, value = c("1-1", "1-1",
"1-1",  : duplicate 'row.names' are not allowed

For simplicity, I illustrate the problem with the following example
(adapted from: http://www.richard-bluhm.com/clustered-ses-in-r-and-stata-2/):
# load packages
library(plm)
library(lmtest)
# get data and load as pdata.frame
url <- "http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.txt"
p.df <- read.table(url)
names(p.df) <- c("firmid", "year", "x", "y")
# introduce a group (State) id: 100 states, 50 consecutive rows each
p.df$State <- rep(1:100, each = 50)
# index by State and year instead of firmid and year
p.df2 <- pdata.frame(p.df, index = c("State", "year"), drop.index = FALSE,
                     row.names = TRUE)
# fit model with plm; this is where the error occurs
pm1 <- plm(y ~ x, data = p.df2, model = "within")

So is there any way I could cluster SE at the group level using plm?
Any other comments would be highly appreciated.
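
To make the target quantity concrete, here is a minimal base-R sketch
of what "clustering at the group level" would compute, independent of
plm. It is an illustration under assumptions, not plm's method: the
data are simulated, the names d, firm, state, demean and se_state are
made up for the example, and the G/(G-1) small-sample factor is just
one common choice (software packages differ here). The individual
stays the fixed-effect unit; only the score sums use the group.

```r
# Simulated panel (illustrative only): 20 states, 5 firms each, 10 years
set.seed(1)
n_state <- 20; n_firm <- 5; n_year <- 10
d <- expand.grid(year = 1:n_year, firm = 1:(n_state * n_firm))
d$state <- (d$firm - 1) %/% n_firm + 1
d$x <- rnorm(nrow(d))
# a state-year shock makes errors correlated within states
shock <- matrix(rnorm(n_state * n_year), n_state, n_year)
d$y <- 0.5 * d$x + shock[cbind(d$state, d$year)] + rnorm(nrow(d))

# within (fixed-effects) transformation at the individual (firm) level
demean <- function(v, g) v - ave(v, g)
yw <- demean(d$y, d$firm)
xw <- demean(d$x, d$firm)
fit <- lm(yw ~ xw - 1)

# cluster-robust (sandwich) variance, with scores summed by state
X <- cbind(xw)
e <- residuals(fit)
bread <- solve(crossprod(X))
scores <- rowsum(X * e, d$state)   # one row of summed scores per state
G <- nrow(scores)
V <- (G / (G - 1)) * bread %*% crossprod(scores) %*% bread  # assumed correction
se_state <- sqrt(diag(V))
```

The slope from fit is the same as the within estimate with firm
dummies; se_state is its standard error clustered by state rather
than by firm.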

Thanks in advance!
Renzo
Center for Development Research
University of Bonn