[R] Probit regression with limited parameter space

Ben Bolker bbolker at gmail.com
Thu Feb 2 21:13:41 CET 2012


  [cc'ing back to r-help again -- I do this so the answers can be
archived and viewed by others]

On 12-02-02 02:41 PM, Sally Luo wrote:
> Prof. Bolker,
> 
> Thanks for your quick reply and detailed explanation.
> 
> I also ran the unrestricted model using glmfit <-
> glm(y~x1+x2+x3+x4+x5+x6+x7+x8, family=binomial(link="probit"),data=d).
> However, the results I got from glm and mle2 (both for the unrestricted
> model) are not very similar (please see below).  In your earlier example,
> both glm and mle2 produce almost the same estimation results.  I just hope
> to figure out what might cause the discrepancy in the estimation results
> I've got.
> 
> 
> coef(summary(*glmfit*))
>                 Estimate   Std. Error    z value     Pr(>|z|)
> (Intercept) -0.853900059 0.2464179864 -3.4652505 5.297377e-04
> x1             1.627125691 0.3076174699  5.2894450 1.226881e-07
> x2            -0.092716326 0.5229866504 -0.1772824 8.592866e-01
> x3            -3.301509522 0.9169991843 -3.6003407 3.178004e-04
> x4             7.187483436 2.2135961171  3.2469715 1.166401e-03
> x5            -0.002544181 0.0112740324 -0.2256673 8.214602e-01
> x6             6.978374268 2.2347939216  3.1226030 1.792594e-03
> x7            -0.009832379 0.0113807583 -0.8639476 3.876167e-01
> x8            -0.001252075 0.0002304789 -5.4324941 5.557178e-08
> 
>> coef(summary(*mle2fit*))
>                         Estimate  Std. Error     z value        Pr(z)
> pprobit.(Intercept) -0.603492668 0.230117071  -2.6225463 8.727541e-03
> pprobit.x1             1.645984346 0.288479906   5.7057158 1.158552e-08
> pprobit.x2            -0.157361533 0.523048376  -0.3008546 7.635253e-01
> pprobit.x3            -3.935203692 0.932692587  -4.2191862 2.451857e-05
> pprobit.x4           7.512701611 0.062911076 119.4177885 0.000000e+00
> pprobit.x5            -0.001475556 0.011525137  -0.1280293 8.981258e-01
> pprobit.x6           7.399355063 0.018372749 402.7353318 0.000000e+00
> pprobit.x7            -0.010113008 0.011647725  -0.8682388 3.852636e-01
> pprobit.x8            -0.001650021 0.000244997  -6.7348622 1.640854e-11

  My best guess is that you are running into optimization problems.

The big advantage of glm() is that it uses a special-purpose
optimization method (iteratively reweighted least squares) that is
generally much more robust/reliable than general-purpose nonlinear
optimizers such as nlminb.  If there is indeed a GLM fitting routine
coded in R, somewhere, that someone has adapted to work with box
constraints, it will probably perform better than mle2.

Some general suggestions for troubleshooting this:

 * check the log-likelihoods returned by the two methods.  If they are
very close (say within 0.01 likelihood units), then the issue is that
you just have a very flat goodness-of-fit surface, and the two sets of
coefficients are in practice very similar to each other.

 * if possible, try starting each approach (glm(), mle2()) from the
solution found by the other (it's a little bit of a pain to get the
syntax right here) and see if they get stuck right where they are or
whether they find that one answer or the other is right.

 * if you were using one of the optimizing methods from optim() (rather
than nlminb), e.g. L-BFGS-B, I would suggest you try using parscale to
rescale the parameters to have approximately equal magnitudes near the
solution.  This apparently isn't possible with nlminb, but you could try
optimizer="optim" (the default), method="L-BFGS-B" and see how you do
(although L-BFGS-B is often a bit finicky).  Alternatively, you can try
optimizer="optimx", in which case you have a larger variety of
unconstrained optimizers to choose from (you have to install the optimx
package and take a look at its documentation).  Alternatively, you can
scale your input variables (e.g. use scale() on your input matrix to get
zero-centered, sd 1 variables), although you would then have to adjust
your lower and upper bounds accordingly.

 * it's a bit more work, but you may be able to unpack this a bit and
provide analytical derivatives.  That would help a lot.

  In short: you are entering the quagmire of numerical optimization methods.

   I have learned most of this stuff by trial and error -- can anyone on
the list suggest a good/friendly introduction?  (Press et al Numerical
Recipes; Givens and Hoeting's Computational Statistics book looks good,
although I haven't read it ...)

  Ben Bolker


> 
> 
> 
> On Thu, Feb 2, 2012 at 1:12 PM, Ben Bolker <bbolker at gmail.com> wrote:
> 
>>
>>  [cc'ing back to r-help]
>>
>> On 12-02-02 01:56 PM, Sally Luo wrote:
>>
>>> I tried to adapt your code to my model and got the results as below.  I
>>> don't know how to fix the warning messages. It says "rearrange the lower
>>> (or upper) bounds to match 'start'".
>>
>>  The warning is overly conservative in this case.  I should work on
>> engineering the package so that it handles this better. You can
>> disregard them.
>>
>>  In answer to your previous questions:
>>
>>  * "size" refers to the number of trials per observation (1, if you have
>> binary data)
>>  * you've got the form of the lower and upper bounds right.
>>  * you've got the formula in 'parameters' right -- this builds a linear
>> model (using R's model.matrix) on the probit scale based on the 8
>> parameters
>>>
>>> And two of the estimates for my restricted parameters are on the
>> boundary.
>>> The warning message says the variance-covariance calculations may be
>>> unreliable.  Those parameters are the ones of interest to my study.  Can
>> I
>>> still make inferences using the p-values reported by mle2 in this case?
>>
>>  That's quite tricky unfortunately, and it isn't a problem that's
>> specific to the mle2 package.  The basic issue is that the whole
>> derivation of the multivariate normal sampling distribution of the
>> maximum likelihood estimator depends on the maximum likelihood being an
>> interior local maximum (and hence having a negative-definite hessian, or
>> a positive-definite information matrix), which is untrue on the boundary
>> -- the Wikipedia article on maximum likelihood mentions this issue, for
>> example http://en.wikipedia.org/wiki/Maximum_likelihood
>>
>>  Perhaps someone here can suggest an approach (although it gets outside
>> the scope of "R help", or you can ask on http://stats.stackexchange.com...
>>
>>
>>>
>>> Thanks for your help.  ---- Sally
>>>
>>>
>>> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>> +             parameters=list(pprobit~x1+x2+x3+x4+x5+x6+x7+x8),
>>> +             start=list(pprobit=0),
>>> +             optimizer="nlminb",
>>> +             lower=c(-Inf,-1,-1,-1,-Inf,-Inf,-Inf,-Inf,-Inf),
>>> +             upper=c(Inf,1,1,1,Inf,Inf,Inf,Inf,Inf),
>>> +             data=d)
>>>
>>> Warning messages:
>>> 1: In fix_order(call$lower, "lower bounds", -Inf) :
>>>   lower bounds not named: rearranging to match 'start'
>>> 2: In fix_order(call$upper, "upper bounds", Inf) :
>>>   upper bounds not named: rearranging to match 'start'
>>> 3: In mle2(y ~ dbinom(pnorm(pprobit), size = 1), parameters =
>> list(pprobit
>>> ~  :
>>>   some parameters are on the boundary: variance-covariance calculations
>>> based on Hessian may be unreliable
>>
>>>
>>>
>>> On Wed, Feb 1, 2012 at 11:16 PM, Sally Luo <shali623 at gmail.com> wrote:
>>>
>>>> Prof. Bolker,
>>>>
>>>> Thanks a lot for your reply.
>>>>
>>>> In my model, I have 9 explanatory variables and I need to restrict the
>>>> range of parameters 2-4 to (-1,1).  I tried to modify the univariate
>> probit
>>>> example you gave in your reply, however, I could not get through.
>>>>
>>>> Specificially, I am not sure what 'pprobit' represents in your code? How
>>>> should I code this part if I have more than one variable?
>>>>
>>>> Also does "size" refer to the number of parameters?
>>>>
>>>> Since only 3 parameters need to be restricted in my model, should I
>> write
>>>> lower=c(-Inf, -1,-1,-1, -Inf, -Inf, -Inf, -Inf, -Inf) and upper=c(Inf,
>>>> 1,1,1, Inf, Inf, Inf, Inf, Inf)?
>>>>
>>>> Thanks again for your kind help.
>>>>
>>>> Best,
>>>>
>>>> Sally
>>>>
>>>>
>>>>
>>>> On Wed, Feb 1, 2012 at 7:19 AM, Ben Bolker <bbolker at gmail.com> wrote:
>>>>
>>>>>  Sally Luo <shali623 <at> gmail.com> writes:
>>>>>
>>>>>>
>>>>>> Dear R helpers,
>>>>>>
>>>>>> I need to estimate a probit model with box constraints placed on
>>>>> several of
>>>>>> the model parameters.  I have the following two questions:
>>>>>>
>>>>>> 1) How are the standard errors calclulated in glm
>>>>>> (family=binomial(link="probit")?  I ran a typical probit model using
>> the
>>>>>> glm probit link and the nlminb function with my own coding of the
>>>>>> loglikehood, separately. As nlminb does not produce the hessian
>> matrix,
>>>>> I
>>>>>> used hessian (numDeriv) to calculate it.  However, the standard errors
>>>>>> calculated using hessian function are quite different from the ones
>>>>>> generated by the glm function, although the parameter estimates are
>> very
>>>>>> close.  I was wondering what makes this difference in the estmation of
>>>>>> standard errors and how this computation is carried out in glm.
>>>>>>
>>>>>> 2) Does any one know how to estimate a constrained probit model in R
>>>>> (to be
>>>>>> specific, I need to restrain the range of three parameters to [-1,1])?
>>>>>> Among the optimation functions, so far nlminb and spg work for my
>>>>> problem,
>>>>>> but neither produces a hessian matrix.  As I mentioned above, if I use
>>>>>> hessian funciton and calculate standard errors manually, the standard
>>>>>> errors seem not right.
>>>>>>
>>>>>
>>>>>   I'm a little biased, but I think the bbmle package is the
>>>>> easiest way to get this done -- it provides convenient wrappers
>>>>> for a range of optimizers including nlminb.
>>>>>   I would warn however that you should be very careful interpreting
>>>>> the meaning of the Hessian matrix if some of your parameters lie
>>>>> on the boundary of the feasible space ...
>>>>>
>>>>> set.seed(101)
>>>>> x <- runif(100)
>>>>> p <- pnorm(1+3*x)
>>>>> y <- rbinom(100,p,size=1)
>>>>> d <- data.frame(x,y)
>>>>> glmfit <- glm(y~x,family=binomial(link="probit"),data=d)
>>>>> coef(summary(glmfit))
>>>>>
>>>>> library(bbmle)
>>>>> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>>>             parameters=list(pprobit~x),
>>>>>             start=list(pprobit=0),
>>>>>             data=d)
>>>>> coef(summary(mle2fit))
>>>>>
>>>>> ## now add constraints
>>>>> mle2fit_constr <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>>>             parameters=list(pprobit~x),
>>>>>             start=list(pprobit=0),
>>>>>             optimizer="nlminb",
>>>>>             lower=c(2,0.15),
>>>>>             data=d)
>>>>>
>>>>> coef(summary(mle2fit_constr))
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
>> <http://www.r-project.org/posting-guide.html>
>>  >>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>>
>>>
>>
>>
>



More information about the R-help mailing list