[R] R: Package plm (pvcm) "within" and lm

Malcolm MISTRY malcolm.mistry at unive.it
Sat Nov 14 14:51:09 CET 2015


Dear all,
I have a question about a fixed-effects specification I am trying to run on my
large balanced panel data set with approx. 15,000 IDs (n) and 33 years (t).

I have used the R packages lfe and plm and am aware of the methods in both
packages for the following specifications:

(a) Individual FE (ID)
(b) Twoways FE (ID + Year)

The third specification I am trying to run has the form

y_it = a_i + b_i*t + e_it,

i.e. a model with an individual-specific intercept and an individual-
specific slope.

To illustrate, I use the Grunfeld data, where firm is the ID and year is the
time index. I did this using both plm (pvcm) and nlme (lmList) and get the
desired results.

e.g. in pvcm, in the form below:

library(plm)

data(Grunfeld, package = "Ecdat")
head(Grunfeld)

# gives individual coefficients and intercepts for each firm
pvcm_model <- pvcm(inv ~ value, data = Grunfeld, model = "within",
                   effect = "individual")
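
The lmList fit mentioned above is not shown in the post. A minimal sketch of what it might look like follows, assuming the same Grunfeld data; the object names lmlist_model and lmlist_coefs are illustrative only, and converting firm to a factor is an assumption made to keep the grouping explicit.

```r
library(nlme)

data(Grunfeld, package = "Ecdat")

# Assumption: treat firm as a factor so the grouping is explicit
Grunfeld$firm <- factor(Grunfeld$firm)

# One separate OLS regression of inv on value per firm
lmlist_model <- lmList(inv ~ value | firm, data = Grunfeld)

# One row per firm, with columns "(Intercept)" and "value"
lmlist_coefs <- coef(lmlist_model)
head(lmlist_coefs)

# Cross-check against a plain lm() fitted to the first firm only
first_level <- levels(Grunfeld$firm)[1]
lm_first <- lm(inv ~ value, data = Grunfeld[Grunfeld$firm == first_level, ])
all.equal(unname(coef(lm_first)),
          unlist(lmlist_coefs[first_level, ], use.names = FALSE))
```

The per-firm rows of lmlist_coefs are the natural thing to set side by side with the coefficients stored in pvcm_model.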

The plm documentation says that pvcm with model = "within" is equivalent to
estimating a separate model for each individual. So my question is this:

*(1) What is the equivalent form of the above pvcm_model if one would like
to repeat the same using 'lm'?*

I tried the specifications below in lm but am unable to replicate the
individual coefficients and the intercepts of pvcm.

# gives the same individual intercepts as pvcm, but the coefficients (slopes)
# are incorrect
ols_model_1 <- lm(inv ~ value * factor(firm) - 1, data = Grunfeld)

# Using time demeaned 'y' (inv) and 'x' (value)

# gives the same individual coefficients (slopes), but the intercepts are now
# incorrect
ols_model_2 <- lm(demeaned_inv ~ factor(firm)/demeaned_value - 1,
                  data = Grunfeld_final)
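
The construction of Grunfeld_final, demeaned_inv and demeaned_value is not shown in the post. A minimal sketch of one way they might be built follows, under the assumption that "time demeaned" means subtracting each firm's own mean over the years.

```r
data(Grunfeld, package = "Ecdat")

# Assumption: demean within firm, i.e. subtract each firm's mean over all years
Grunfeld_final <- Grunfeld
Grunfeld_final$demeaned_inv   <- with(Grunfeld, inv   - ave(inv,   firm))
Grunfeld_final$demeaned_value <- with(Grunfeld, value - ave(value, firm))

# Each firm's demeaned series should average to zero (up to rounding error)
range(tapply(Grunfeld_final$demeaned_inv, Grunfeld_final$firm, mean))
```

With both variables demeaned within firm, every per-firm regression line passes through the origin, so the firm intercepts in such a fit are zero by construction. On the original scale the intercept for each firm would be mean(inv) - slope * mean(value).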

*(2) My second question is whether there is any difference between the above
pvcm specification and a specification such as the one below (note that the
'i' is dropped from the intercept):*

y_it = a + b_i*t + e_it

In other words, if I would like to run a time-series kind of specification
on each individual ID in my panel, giving me individual-specific slopes
and intercepts, does the above pvcm specification imply the same? I am
confused by the 'i' subscript, because in FE models we use 'i' but in pooling
we do not, and so 'a_i' should imply 'a'. Moreover, I am confused about why
the term "within" is used in pvcm if we are talking about individual-specific
slopes and intercepts anyway.
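
The two specifications being compared can be written down in lm terms as a rough sketch, assuming value plays the role of the regressor as in the pvcm call earlier in the post. The object names common_intercept_model and separate_model are illustrative and do not come from the post.

```r
data(Grunfeld, package = "Ecdat")
Grunfeld$firm <- factor(Grunfeld$firm)

# y_it = a + b_i * x_it + e_it: one common intercept, one slope per firm
common_intercept_model <- lm(inv ~ value:firm, data = Grunfeld)

# y_it = a_i + b_i * x_it + e_it: one intercept and one slope per firm
separate_model <- lm(inv ~ 0 + firm + firm:value, data = Grunfeld)

# Number of estimated coefficients in each specification
length(coef(common_intercept_model))  # 1 + number of firms
length(coef(separate_model))          # 2 * number of firms
```

The first model estimates a single shared intercept plus one slope per firm. The second is algebraically the same as fitting a separate regression per firm, so its point estimates should match lm run firm by firm.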

I hope someone can reply soon, as I am really confused about the pvcm
specification and the specification in (2).

Note that my data set is large (approx. 500,000 observations) and it is at
0.5 x 0.5 deg resolution (global lat/lon). pvcm did run on a cluster, although
it took a few hours, whereas nlme (lmList) was quite fast.

Rgds
Malcolm
