[R] coefficients poolability (was: question regarding panel data analysis)

Millo Giovanni Giovanni_Millo at Generali.com
Thu Jul 1 17:07:37 CEST 2010


Hello.

Not an easy question at all, and it has little to do with software,
alas! 

Veeeeeery loosely speaking: if the homogeneity hypothesis is rejected,
then, depending on data availability, you may still be able to treat the
data like a panel by:
a) ignoring the results of the poolability test
b) allowing the coefficients to vary. 
Of course, a) requires some courage while b) requires more degrees of
freedom, etc. Some authoritative commentators (Baltagi) stress the
advantages of imposing even an uncertain homogeneity hypothesis over
resorting to heterogeneous techniques with uncertain small-sample
properties (especially if data are not in the thousands...) on grounds
of efficiency. Others (Pesaran) support the opposite strategy on grounds
of consistency. You might start your inquiry from Baltagi, Griffin and
Xiong, "To pool or not to pool", The Review of Economics and Statistics,
February 2000, 82(1): 117-126. 
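
To make options a) and b) concrete, here is a minimal sketch that fits
both a pooled model and separate unit-by-unit regressions. It assumes the
plm package is installed, uses the copy of the Cigar data bundled with plm,
and takes a static formula (no lagged dependent variable) purely for
simplicity. The names pCigar, fm_static, pooled and hetero are made up
for this illustration.

```r
library(plm)

# Cigar data as bundled with plm (assumption: plm is installed)
data("Cigar", package = "plm")
pCigar <- pdata.frame(Cigar, index = c("state", "year"))

# Static specification, chosen only to keep the example short
fm_static <- log(sales) ~ log(price) + log(pimin) + log(ndi)

# a) impose homogeneity: one coefficient vector for all states
pooled <- plm(fm_static, data = pCigar, model = "pooling")

# b) let coefficients vary: one OLS regression per state
hetero <- pvcm(fm_static, data = pCigar, model = "within")

coef(pooled)            # a single set of coefficients
summary(coef(hetero))   # spread of the state-specific coefficients
```

Comparing the pooled coefficients with the spread of the state-specific
ones gives a first impression of how much heterogeneity there is.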

In a nutshell, you must strike a balance between efficiency and
consistency of the estimators, all in the light of the power properties
of the pooling test. Your choice will depend on your goal (coefficient
interpretation rests on consistency, prediction will emphasize model fit
and stability, etc.), on how many and how noisy the data are, etc. It also
depends on how "strongly" poolability is rejected: is the p-value 0.049
or < 2.2e-16? Moreover, if you have 20,000 data points, most hypotheses
end up being rejected (see Leamer, 1978 on this) but you can also afford
to estimate N*(K+1) parameters (K slopes plus an intercept for each of
the N units). Conversely, on 3x30 data points I wouldn't even run the
poolability test on slope parameters, but only on intercepts... The
"problematic" dimensions in the light of the efficiency/consistency
tradeoff might be like Baltagi et al.'s (30 years x 46 states).
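
The parameter count and the "intercepts only" test can be sketched as
follows. This is only an illustration under stated assumptions: 3x30 is
read as 3 units by 30 periods, K = 4 slopes is an arbitrary choice, the
helper n_params is hypothetical, and the static formula is used for
brevity. The intercepts-only test is taken here to be the usual F test
of the within model against pooled OLS, as done by pFtest in plm.

```r
library(plm)

# Parameters of the fully heterogeneous model:
# each of the N units gets K slopes plus its own intercept.
n_params <- function(N, K) N * (K + 1)

n_params(N = 3, K = 4)   # 15 parameters out of 3 x 30 = 90 observations

data("Cigar", package = "plm")
pCigar <- pdata.frame(Cigar, index = c("state", "year"))
fm_static <- log(sales) ~ log(price) + log(pimin) + log(ndi)

# Same count for the Cigar panel, against its number of observations
n_params(N = length(unique(Cigar$state)), K = 4)
nrow(Cigar)

# Poolability of the intercepts only: within model vs. pooled OLS
ft <- pFtest(fm_static, data = pCigar, effect = "individual")
ft
```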

This was just very loose talk to give you an idea of the issues
involved: I strongly suggest you check out the literature. Turning back
from philosophy to software, available methods for panels with
heterogeneous slopes are the Swamy estimator in pvcm{plm} and the mixed
models' methods in packages nlme and lme4.
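
A minimal sketch of the two routes, assuming plm is installed and, for
the optional part, lme4 as well. It uses the Grunfeld data bundled with
plm rather than the Cigar data, simply because that is the data set used
in the pvcm help page. The mixed model shown has a random intercept and
one random slope only, so it is a partial analogue of the Swamy
estimator, not an equivalent; it may also emit convergence or
singular-fit warnings on these data.

```r
library(plm)

data("Grunfeld", package = "plm")

# Swamy random coefficients estimator
swamy <- pvcm(inv ~ value + capital, data = Grunfeld, model = "random")
summary(swamy)

# Mixed-model route, run only if lme4 is available:
# random intercept and random slope on capital, by firm
if (requireNamespace("lme4", quietly = TRUE)) {
  mm <- lme4::lmer(inv ~ value + capital + (1 + capital | firm),
                   data = Grunfeld)
  print(lme4::fixef(mm))
}
```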

PS if you want to play with the Baltagi et al. data,

> data(Cigar, package="Ecdat")
> fm <- log(sales)~lag(log(sales),1)+log(price)+log(pimin)+log(ndi)

...and so on. (The pooltest rejects poolability very strongly.)
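
For reference, one way the poolability test itself might be run on this
formula is sketched below. The assumptions are mine: the data are loaded
from plm's bundled copy of Cigar so that the block is self-contained, the
panel index is set explicitly through pdata.frame, and the fixed-effects
("within") model is taken as the restricted model.

```r
library(plm)

# Assumption: plm's bundled Cigar is the same data set as Ecdat's
data("Cigar", package = "plm")
pCigar <- pdata.frame(Cigar, index = c("state", "year"))

# On a pdata.frame, lag() is taken within each state
fm <- log(sales) ~ lag(log(sales), 1) + log(price) + log(pimin) + log(ndi)

# F test of common slopes (within model) against state-by-state regressions
pt <- pooltest(fm, data = pCigar, model = "within")
pt
```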

HTH,
Giovanni

************************************
Message: 108
Date: Thu, 1 Jul 2010 02:12:20 +0200
From: amatoallah ouchen <at.ouchen at gmail.com>
To: r-help at r-project.org
Subject: [R] question regarding panel data analysis

Good day R-users,

The question may seem easy to many of you, but it presents a
serious issue for me. I'm currently running a panel data analysis.
I've used the plm package to perform the tests of poolability; as a
result, intercepts and coefficients are assumed to be different. So my
question is: should I give up the panel analysis in my case, or is there
any alternative methodology or transformation I can use instead?

Any hint would be highly appreciated

thanks a lot in advance.

Ama
************************************

Giovanni Millo
Research Dept.,
Assicurazioni Generali SpA
Via Machiavelli 4, 
34132 Trieste (Italy)
tel. +39 040 671184 
fax  +39 040 671160 

