[R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives

Sun Aug 2 16:49:40 CEST 2009

Hi,

Providing the gradient function is generally a good idea in optimization; however, it is not necessary.  Almost all optimization routines will compute the gradient using a simple finite-difference approximation if it is not user-specified.  If your function is very complicated, then you are more likely to make a mistake in deriving the analytic gradient, although many optimization routines also provide a check of whether the gradient is correctly specified.  You can also do this check yourself by comparing your analytic gradient against the numerical one from the `grad' function in the "numDeriv" package.
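
To make the check concrete, here is a minimal sketch of the idea, written in Python with only the standard library rather than in R, purely for illustration.  It assumes an ordinary (single-level) multinomial logit with class 0 as the reference category and one coefficient vector per non-reference class; the toy data, the starting values, and all function names are made up for this example and are not taken from the original question.  The analytic gradient of the negative log-likelihood is compared against a central finite-difference approximation, which is the same kind of comparison that `grad' from "numDeriv" makes possible in R.

```python
import math

def softmax_probs(beta, x):
    """Class probabilities for one observation; class 0 is the reference."""
    # The reference class has linear predictor 0; the others use their own coefficients.
    scores = [0.0] + [sum(b * v for b, v in zip(bk, x)) for bk in beta]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def neg_log_lik(beta, X, y):
    """Negative log-likelihood of a single-level multinomial logit."""
    return -sum(math.log(softmax_probs(beta, x)[yi]) for x, yi in zip(X, y))

def neg_log_lik_grad(beta, X, y):
    """Analytic gradient: sum over observations of (p_k - 1[y == k]) * x."""
    grad = [[0.0] * len(bk) for bk in beta]
    for x, yi in zip(X, y):
        p = softmax_probs(beta, x)
        for k in range(len(beta)):
            resid = p[k + 1] - (1.0 if yi == k + 1 else 0.0)
            for j, v in enumerate(x):
                grad[k][j] += resid * v
    return grad

def numeric_grad(f, beta, h=1e-6):
    """Central finite-difference approximation to the gradient of f at beta."""
    grad = [[0.0] * len(bk) for bk in beta]
    for k in range(len(beta)):
        for j in range(len(beta[k])):
            orig = beta[k][j]
            beta[k][j] = orig + h
            f_plus = f(beta)
            beta[k][j] = orig - h
            f_minus = f(beta)
            beta[k][j] = orig  # restore the parameter
            grad[k][j] = (f_plus - f_minus) / (2.0 * h)
    return grad

# Toy data (hypothetical): 4 observations, intercept plus 2 predictors, 3 classes.
X = [[1.0, 0.0, 2.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 3.0]]
y = [0, 1, 2, 1]
beta = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]  # arbitrary starting values

analytic = neg_log_lik_grad(beta, X, y)
numeric = numeric_grad(lambda b: neg_log_lik(b, X, y), beta)
max_diff = max(abs(a - n)
               for row_a, row_n in zip(analytic, numeric)
               for a, n in zip(row_a, row_n))
print("largest analytic vs. numerical discrepancy:", max_diff)
```

If the two gradients disagree by much more than the finite-difference error, the analytic derivation (or its coding) is the first place to look.  A partially pooled model with extra terms in the likelihood would need those terms added to both the objective and the gradient before running the same comparison.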

Hope this is helpful,
Ravi.

____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu

----- Original Message -----
From: nikolay12 <nikolay12 at gmail.com>
Date: Sunday, August 2, 2009 3:04 am
Subject: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
To: r-help at r-project.org

>  Hi,
>  
>  I would like to apply the L-BFGS optimization algorithm to compute the MLE
>  of a multilevel multinomial logistic regression.
>  
>  The likelihood formula for this model has as one of the summands the formula
>  for computing the likelihood of an ordinary (single-level) multinomial logit
>  regression. So I would basically need the R implementation for this formula.
>  The L-BFGS algorithm also requires computing the partial derivatives of that
>  formula with respect to all parameters. I would appreciate it if you could
>  point me to existing implementations that can do the above.
>  
>  Nick
>  
>  PS. The long story for the above:
>  
>  My data is as follows: 
>  
>  - a vector of observed values (length = D) of the dependent multinomial
>  variable, each element belonging to one of N levels of that variable
>  
>  - a matrix of corresponding observed values (O x P) of the independent
>  variables (P in total, most of them are binary but also a few are
>  integer-valued)
>  
>  - a vector of current estimates (or starting values) for the Beta
>  coefficients of the independent variables (length = P).
>  
>  This data is available for 4 different pools. The partially-pooled model
>  that I want to compute has as a likelihood function a sum of several
>  elements, one being the classical likelihood function of a multinomial logit
>  regression for each of the 4 pools.
>  
>  This is the same model as in Finkel and Manning "Hierarchical Bayesian
>  Domain Adaptation" (2009).
>  