[R] Coding for segmented regression within a hurdle model

Sat Mar 15 19:35:34 CET 2014

On Sat, 15 Mar 2014, Tim Marcella wrote:

> Hi,
>
> I am using a two part hurdle model to account for zero inflation and
> overdispersion in my count data. I would like to account for a segmented or
> breakpoint relationship in the binomial logistic hurdle model and pass
> these results onto the count model (negative binomial).
>
> Using the segmented package I have determined that my data support one
> breakpoint at 3.85. The slope up to this point is significant and affects
> the presence of zeros in a linear fashion. The slope beyond 3.85 is
> non-significant and is estimated not to help predict the presence of zeros
> in the data (threshold effect). Here are the results from this model:
>
> Estimated Break-Point(s):
>   Est. St.Err
> 3.853  1.372
>
> t value for the gap-variable(s) V:  0
>
> Meaningful coefficients of the linear terms:
>               Estimate Std. Error z value Pr(>|z|)
> (Intercept)     -0.2750     0.3556  -0.774   0.4392
> approach_km     -0.4520     0.2184  -2.069   0.0385 *
> sea2             0.3627     0.2280   1.591   0.1117
> U1.approach_km   0.4543     0.2188   2.076       NA
>
> U1.approach_km is the estimated change in slope at the breakpoint. The
> actual estimated slope for the second section is the sum of this value and
> the approach_km value (-0.4520 + 0.4543 = 0.0023).
>
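
As a quick check of that arithmetic, here is a minimal sketch that uses only the two coefficients printed in the output above. It assumes the usual segmented parameterization, in which U1.approach_km is the change in slope at the breakpoint, so the slope of the second segment is the sum of the two coefficients. The variable names are made up for illustration.

```r
# Coefficients copied from the segmented output quoted above
b_approach <- -0.4520   # slope before the breakpoint
b_U1       <-  0.4543   # change in slope at the breakpoint

# Slope after the breakpoint = first slope + change in slope
slope_after <- b_approach + b_U1
round(slope_after, 4)
```

If the fitted segmented object is still available, the package's slope() function should report this second slope together with a standard error, which is not available from the coefficient table alone.
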
> I think that I have found a way to "manually" code this into the hurdle
> model as follows:
>
> hurdle.fit <- hurdle(tot_f ~ x1 + x2 + x3 |
>                      approach_km + I(pmax(approach_km - 3.849, 0)) + sea)
>
> When I look at the estimated coefficients from the "manual" code, they are
> essentially the same values. However, the standard errors are smaller.
>
> Zero hurdle model coefficients (binomial with logit link):
>                                Estimate Std. Error z value Pr(>|z|)
> (Intercept)                     -0.27441    0.29347  -0.935    0.350
> approach_km                     -0.45261    0.09993  -4.529 5.92e-06 ***
> I(pmax(approach_km - 3.849, 0))  0.45486    0.10723   4.242 2.22e-05 ***
> sea2                             0.36271    0.22803   1.591    0.112
>
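
The hinge term I(pmax(approach_km - 3.849, 0)) in the formula above is zero below the breakpoint and increases linearly above it, which is what makes the predictor piecewise linear. The sketch below illustrates this with a plain logistic glm() on simulated data. The data, the coefficients used for simulation, and the names x, hinge, and psi are all assumptions for illustration, not the poster's data. It also computes the slope after the breakpoint with a standard error that is conditional on the breakpoint being known. Treating the breakpoint as a fixed constant in this way is presumably why the manual fit reports smaller standard errors than segmented, which estimates the breakpoint from the data.

```r
set.seed(1)
n   <- 500
psi <- 3.849                    # breakpoint, treated as known here
x   <- runif(n, 0, 10)          # simulated stand-in for approach_km
hinge <- pmax(x - psi, 0)       # 0 below psi, (x - psi) above it

# Simulate a binary response: slope -0.45 below psi, flat above it
eta <- -0.27 - 0.45 * x + 0.45 * hinge
y   <- rbinom(n, size = 1, prob = plogis(eta))

# Logistic regression with the hinge term as an ordinary covariate
fit <- glm(y ~ x + hinge, family = binomial)
coef(summary(fit))

# Slope after the breakpoint and its standard error, conditional on psi
b <- coef(fit)
V <- vcov(fit)
slope_after <- unname(b["x"] + b["hinge"])
se_after    <- sqrt(V["x", "x"] + V["hinge", "hinge"] + 2 * V["x", "hinge"])
c(slope = slope_after, se = se_after)
```
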
> Question # 1: Does the hurdle equation use the standard errors from the
> zero model when building the count predictions?

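No. The likelihood of a hurdle model separates into the zero hurdle part and
the truncated count part, so both parts of the model can be estimated
completely independently. The count model does not use the standard errors
from the zero model.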
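
A minimal base R sketch of this separation, on simulated data, is given below. It fits the zero hurdle as an ordinary logistic regression and the count part as a zero-truncated model on the positive counts only, without passing anything from one fit to the other. Note the assumptions: the data are simulated, the names (x, y, negll, and so on) are hypothetical, and a zero-truncated Poisson is swapped in for the negative binomial to keep the example short.

```r
set.seed(42)
n <- 400
x <- rnorm(n)

# Simulate hurdle data: a binary "crosses the hurdle" indicator, then a
# zero-truncated Poisson count for the observations that cross it
nonzero <- rbinom(n, 1, plogis(0.3 + 0.8 * x))
lambda  <- exp(0.5 + 0.4 * x)
u       <- runif(n, ppois(0, lambda), 1)   # inverse-CDF draw restricted to y >= 1
y       <- ifelse(nonzero == 1, qpois(u, lambda), 0)

# Part 1: zero hurdle, a logistic regression on the indicator y > 0
zero_fit <- glm(I(y > 0) ~ x, family = binomial)

# Part 2: zero-truncated Poisson, using the positive counts only
yp <- y[y > 0]
xp <- x[y > 0]
negll <- function(beta) {
  mu <- exp(beta[1] + beta[2] * xp)
  # truncated log-density: log f(y) - log(1 - f(0))
  -sum(dpois(yp, mu, log = TRUE) - log1p(-exp(-mu)))
}
count_fit <- optim(c(0, 0), negll, method = "BFGS", hessian = TRUE)

# Estimates and standard errors of the count part; nothing from zero_fit
# (coefficients or standard errors) enters this computation
count_coef <- count_fit$par
count_se   <- sqrt(diag(solve(count_fit$hessian)))
cbind(estimate = count_coef, se = count_se)
```

Changing the zero hurdle formula, for example by adding or removing the hinge term, would leave count_coef and count_se unchanged in this sketch.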

Best,
Z

> If not, then I guess I do not have to worry about this and can just report
> the original standard errors and associated p-values from the segmented
> object in the publication.
>
> Question # 2: If the count model does use the standard errors, how can I
> reformulate this equation to reproduce the original standard errors?
>
> Thanks, Tim