[R] Problems with Panel Data estimation

JBrettas jcosta at marketdata.com.br
Wed Jan 18 14:14:34 CET 2012


Hi everybody,

I have some questions and would really appreciate help, so please ask me if
anything isn't clear.

I have a data set with the following panel data structure:

> head(dados_2)
  Tempo Safra   Data   Resposta Perc_Resg_Acum Alta_Temporada Flexi Promo
1     1     1 200701 0.04223216              0              1     0     0
2     1     2 200702 0.02801536              0             -1     0     0
3     1     3 200703 0.02786171              0              0     0     0
4     1     4 200704 0.02913633              0              0     0     0
5     1     5 200705 0.03953217              0              0     0     0
6     1     6 200706 0.05084010              0              0     0     0
  Promo_Ponto_Frio Parceiros
1                0         0
2                0         0
3                0         0
4                0         0
5                0         0
6                0         0
> 


where I have 25 levels of "Tempo" and 34 for "Safra".

I want to obtain the confidence intervals of the regression coefficients,
and also forecast the "Resposta" variable with prediction intervals.
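
To make the goal concrete, here is a minimal sketch on simulated data for a
single ordinary regression. The data frame `toy`, its columns and the values
in `new_obs` are made up for illustration and are not part of dados_2. It uses
only base R: confint() for coefficient intervals and predict() with
interval = "prediction" for forecasts.

```r
# Simulated data, for illustration only (not the real dados_2)
set.seed(42)
toy <- data.frame(x = seq(0, 1, length.out = 30))
toy$y <- 0.03 + 0.05 * toy$x + rnorm(30, sd = 0.005)

fit <- lm(y ~ x, data = toy)

# 95% confidence intervals for the regression coefficients
ci <- confint(fit, level = 0.95)
print(ci)

# 95% prediction intervals for new (hypothetical) observations
new_obs <- data.frame(x = c(1.1, 1.2))
pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)
print(pred)
```

The same two calls could in principle be applied to each per-group fit, as
long as all coefficients in that group are estimable.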

However, I have run into some problems:
- When "Tempo" = 1 (the time index), the variable "Perc_Resg_Acum" is always 0.
- I have several data sets of the same kind (panel data structure), and in
some of them the "Promo" column is zero throughout.

I'm modeling with the functions pvcm() and lmList() (which, as far as I
understand, are equivalent here). Instead of giving 0 as the coefficient for
the variable "Promo", the function drops the entire column from the model
before computing the estimates. How can I keep the all-zero columns in the
regression model and get a zero coefficient instead of NA?
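
The behaviour can be reproduced with a small simulated example in base R (the
data frame `toy` and its columns are made up for illustration). An all-zero
column carries no information, so lm() reports NA for its coefficient. The
last step, replacing NA by 0 after fitting, is only a sketch of one possible
workaround under the assumption that "not estimable" may be reported as 0. It
does not produce a standard error for that coefficient.

```r
# Simulated data, for illustration only (not the real dados_2)
set.seed(1)
toy <- data.frame(
  x     = rnorm(20),
  promo = rep(0, 20)   # all-zero column, like "Promo" in some data sets
)
toy$y <- 0.5 + 2 * toy$x + rnorm(20, sd = 0.1)

fit <- lm(y ~ x + promo, data = toy)

# The coefficient of the all-zero column is not estimable and comes back as NA
cf <- coef(fit)
print(cf)

# Assumption: report non-estimable coefficients as 0 instead of NA
cf_zero <- cf
cf_zero[is.na(cf_zero)] <- 0
print(cf_zero)
```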

Here is the relevant part of my code:

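library(plm)   # provides pvcm()
library(nlme)  # provides lmList(); assumed from the printed output below

# Variable-coefficients model: one regression per individual
model_within1 <- pvcm(
  Resposta ~ Perc_Resg_Acum + Alta_Temporada + Flexi + Promo +
    Promo_Ponto_Frio + Parceiros,
  data = dados_2, model = "within")

# Separate lm() fit for each level of Tempo
model_within2 <- lmList(
  Resposta ~ Perc_Resg_Acum + Alta_Temporada + Flexi + Promo +
    Promo_Ponto_Frio + Parceiros | Tempo,
  data = dados_2)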


When I run the first model, I get this:


> model_within1 <-
> pvcm(Resposta~Perc_Resg_Acum+Alta_Temporada+Flexi+Promo+Promo_Ponto_Frio+Parceiros,
> data = dados_2, model="within")
serie Promo_Ponto_Frio  is constant and has been removed
Error in eval(expr, envir, enclos) : 
  object 'Promo_Ponto_Frio' not found

(I don't want the constant column to be removed; I want to keep it in the
model and carry on using it.)


With the lmList() function I get no error message, but the output is:

> model_within2
Call:
  Model: Resposta ~ Perc_Resg_Acum + Alta_Temporada + Flexi + Promo +
Promo_Ponto_Frio + Parceiros | Tempo 
   Data: dados_2 

Coefficients:
   (Intercept) Perc_Resg_Acum Alta_Temporada         Flexi         Promo
1   0.05575606             NA   0.0094899066            NA            NA
2   0.02767265    0.602756910   0.0097098374            NA            NA
3   0.01493001    0.216359571   0.0072083199            NA            NA
4   0.01702644    0.130147260   0.0073664874            NA            NA
5   0.02199162    0.077860221   0.0072053502            NA            NA
6   0.02574624    0.049635548   0.0062181048            NA            NA
7   0.02672193    0.035288194   0.0064811866            NA            NA
8   0.03620546    0.001001478   0.0056185695  0.0117215422            NA
9   0.03834693   -0.007266645   0.0060674586  0.0107572932            NA
10  0.03851210   -0.011103720   0.0059166792  0.0099111959            NA
11  0.03877860   -0.011788541   0.0052854680  0.0085353595            NA
12  0.04213484   -0.017576921   0.0049738941  0.0084101158            NA
13  0.04217531   -0.017615294   0.0057095338  0.0094572949            NA
14  0.04591170   -0.027457745   0.0052304802  0.0091305518            NA
15  0.05575347   -0.047174244   0.0043227892  0.0070854218            NA
16  0.06751835   -0.068053502   0.0041113756  0.0035637901            NA
17  0.06743575   -0.066419074   0.0035628714  0.0027136668            NA
18  0.08494492   -0.092279778   0.0027045102  0.0025033917  0.0056991774
19  0.10540605   -0.122592396   0.0034576115  0.0007773916  0.0043499539
20  0.09374987   -0.102578612   0.0022937536  0.0003246327 -0.0033912104
21  0.09477620   -0.103937511   0.0018046064 -0.0019254150  0.0038385416
22  0.07984309   -0.081880920   0.0031004004  0.0012212949 -0.0001274436
23  0.04209354   -0.027693308   0.0033759713  0.0012759561 -0.0005342256
24  0.02248439    0.001793878   0.0019126335  0.0016330109 -0.0012942994
25 -0.04124712    0.093798787  -0.0009151255  0.0026952764  0.0002002742
   Promo_Ponto_Frio     Parceiros
1                NA  0.0085825438
2                NA -0.0040152859
3                NA -0.0015317053
4                NA -0.0016866579
5                NA -0.0014183949
6                NA -0.0016753846
7                NA -0.0012411159
8                NA -0.0016690746
9                NA -0.0018987163
10               NA -0.0016922052
11               NA -0.0017404386
12               NA -0.0017259225
13               NA -0.0014849246
14               NA -0.0014719829
15               NA -0.0016265977
16               NA -0.0015527121
17               NA -0.0014492467
18               NA -0.0016308425
19               NA -0.0013443498
20               NA -0.0012088912
21               NA -0.0006974880
22               NA -0.0006981946
23               NA -0.0006599528
24               NA -0.0004592202
25               NA -0.0020974059

Because of these NAs, when I run summary(model_within2) I get estimates,
standard errors and t statistics for only 3 of the variables. Is there a way
to solve this problem, that is, a way to keep the constant columns in my model
as well?


Help me, please!!!

--
View this message in context: http://r.789695.n4.nabble.com/Problems-with-Panel-Data-estimation-tp4306602p4306602.html
Sent from the R help mailing list archive at Nabble.com.


