[R] Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS"

Nick Livingston nlivingston at ymail.com
Fri Aug 29 05:57:26 CEST 2014


I have sought consultation online and in person, to no avail. I hope someone
on here might have some insight. Any feedback would be most welcome.

I am attempting to plot predicted values from a two-component hurdle model
(logistic [suicide attempt yes/no] and negative binomial count [number of
attempts thereafter]). To do so, I estimated each component separately using
glm (MASS). While I am able to reproduce hurdle results for the logit
portion in glm, estimates for the negative binomial count component are
different.

Call:
hurdle(formula = Suicide. ~ Age + gender + Victimization * FamilySupport |
Age + gender + Victimization * FamilySupport, dist = "negbin", link =
"logit")

Pearson residuals:
    Min      1Q  Median      3Q     Max
-0.9816 -0.5187 -0.4094  0.2974  5.8820

Count model coefficients (truncated negbin with log link):
                                                Estimate Std. Error z value
Pr(>|z|)   
(Intercept)                          -0.29150    0.33127  -0.880   0.3789   
Age                                      0.17068    0.07556   2.259   0.0239
*
gender                                 0.28273    0.31614   0.894   0.3712   
Victimization                         1.08405    0.18157   5.971 2.36e-09
***
FamilySupport                      0.33629    0.29302   1.148   0.2511   
Victimization:FamilySupport -0.96831    0.46841  -2.067   0.0387 *
Log(theta)                            0.12245    0.54102   0.226   0.8209   
Zero hurdle model coefficients (binomial with logit link):
                                                 Estimate Std. Error z value
Pr(>|z|)   
(Intercept)                           -0.547051   0.215981  -2.533  0.01131
*
Age                                     -0.154493   0.063994  -2.414
0.01577 *
gender                                 -0.030942   0.284868  -0.109  0.91350     
Victimization                          1.073956   0.338015   3.177  0.00149
**
FamilySupport                       -0.380360   0.247530  -1.537  0.12439   
Victimization\:FamilySupport  -0.813329   0.399905  -2.034  0.04197 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Theta: count = 1.1303
Number of iterations in BFGS optimization: 23
Log-likelihood: -374.3 on 25 Df
> summary(logistic)




Call:
glm(formula = SuicideBinary ~ Age + gender = Victimization * FamilySupport,
family = "binomial")

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9948  -0.8470  -0.6686   1.1160   2.0805

Coefficients:
                                                 Estimate Std. Error z value
Pr(>|z|)   
(Intercept)                          -0.547051   0.215981  -2.533  0.01131 *
Age                                    -0.154493   0.063994  -2.414  0.01577
*
gender                                -0.030942   0.284868  -0.109  0.91350   
Victimization                         1.073956   0.338014   3.177  0.00149
**
FamilySupport                      -0.380360   0.247530  -1.537  0.12439   
Victimization:FamilySupport  -0.813329   0.399904  -2.034  0.04197 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 452.54  on 359  degrees of freedom
Residual deviance: 408.24  on 348  degrees of freedom
  (52 observations deleted due to missingness)
AIC: 432.24

Number of Fisher Scoring iterations: 4

> summary(Count1)






Call:
glm(formula = NegBinSuicide ~ Age + gender + Victimization * FamilySupport,
family = negative.binomial(theta = 1.1303))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.6393  -0.4504  -0.1679   0.2350   2.1676

Coefficients:
                                                Estimate Std. Error t value
Pr(>|t|)   
(Intercept)                            0.60820    0.13779   4.414 2.49e-05
***
Age                                      0.08836    0.04189   2.109   0.0373
*
gender                                  0.10983    0.17873   0.615   0.5402   
Victimization                          0.73270    0.10776   6.799 6.82e-10
***
FamilySupport                        0.10213    0.15979   0.639   0.5241   
Victimization:FamilySupport   -0.60146    0.24532  -2.452   0.0159 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(1.1303) family taken to be
0.4549082)

    Null deviance: 76.159  on 115  degrees of freedom
Residual deviance: 35.101  on 104  degrees of freedom
  (296 observations deleted due to missingness)
AIC: 480.6

Number of Fisher Scoring iterations: 15


Alternatively, if there is a simpler way to plot hurdle regression output, or if anyone is award of another means of estimating NB models (I haven't had much luck with vglm from VGAM either), I would be happy to hear about that as well. I'm currently using the "visreg"
package for plotting.

Thanks!
 
   
 
    



More information about the R-help mailing list