[R] Discrete choice model maximum likelihood estimation

infinitehorizon barisvardar at hotmail.com
Mon May 14 13:21:57 CEST 2012


Hello again,

I changed the name to tt. 
and for a and tt actually I was getting them from data, I didn't put them
here in the question. Now I restructured my code and below I copy the full
code, I tried many things but still getting the same error, I don't
understand where is the mistake.

I also defined one more variable to increase comprehension of the problem.
Instead of data, I define three representative vectors in the beginning. If
you run this code, you will see the error message.

# Variables, a: discrete choice variable-dependent, x and tt are independent
variables, tt is binary
a  <- c(1,1,2,3,2,3,1,2,2,3,1,1)
x  <- c(23,26,12,27,10,30,13,20,23,44,17,15)
tt <- c(1,0,0,1,1,1,0,0,1,0,1,1)

# First, just to see, the linear model

LM	<-lm(a ~ x+tt)
coefLM	<- coefficients(LM)
summary(LM)

# Probabilities for discrete choices for a=3, a=2 and a=1 respectively 
P3 <- function(bx,b3,b,tt) { 
P <- exp(bx*x+b3+b*(tt==1))/(1-exp(bx*x+b3+b*(tt==1))) 
return(P) 
} 
P2 <- function(bx,b2,b,tt) { 
P <- exp(bx*x+b2+b*(tt==1))/(1-exp(bx*x+b2+b*(tt==1))) 
return(P) 
} 
P1 <- function(bx,b1,b,tt) { 
P <- exp(bx*x+b1+b*(tt==1))/(1-exp(bx*x+b1+b*(tt==1))) 
return(P) 
}

# Likelihood functions for discrete choices for a=3, a=2 and a=1
respectively

L3 <- function(bx,b1,b2,b3,b,tt) { 
P11 <- P1(bx,b1,b,tt) 
P22 <- P2(bx,b2,b,tt) 
P33 <- P3(bx,b3,b,tt) 

L3l <- P11*P22*P33 
return(L3l) 
} 

L2 <- function(bx,b1,b2,b,tt) { 
P11 <- P1(bx,b1,b,tt) 
P22 <- P2(bx,b2,b,tt) 

L2l <- P11*P22 
return(L2l) 
} 

L1 <- function(bx,b1,b,tt) { 
P11 <- P1(bx,b1,b,tt) 

L1l <- P11 
return(L1l) 
}

# Log-likelihood function 

llfn <- function(param) { 

bx <- param[1] 
b1 <- param[2] 
b2 <- param[3] 
b3 <- param[4] 
b <- param[5] 

lL1 <- log(L1(bx,b1,b2,b,tt)) 
lL2 <- log(L2(bx,b1,b2,b3,b,tt)) 
lL3 <- log(L3(bx,b1,b2,b3,b,tt)) 

llfn <- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3 
} 
start.par <- c(0,0,0,0,0) 
est <- optim(param=start.par,llfn,x=x, a=a, tt=tt, method =
c("CG"),control=list(trace=2,maxit=2000), hessian=TRUE)


Thank you very much,

Best,

Marc







Rui Barradas wrote
> 
> Ok, I forgot to say that 't' is also an R function, the matrix transpose.
> Sorry, but after 'par' I thought (in my mind) I had said it when in fact I
> even talked about 't'!
> Use 'tt'.
> If 'tt' is a vector you must first define it, in your code it doesn't
> exist. That's why R searches for and finds an object that does exist,
> function t().
> 
> You must also create 'a' beforehand. Maybe it makes sense to assign the
> choice and then call optim.
> 
> In functions L2 and L1 you don't use the values of, resp., P33 and P22.
> The calls to P3 and P2 are a waste of time.
> 
> And, as a final note, optim minimizes it's objective function, so the
> standard trick is to minimize the symmetric of the log like.
> In your case, pass -llfn to optim.
> 
> Rui Barradas
> 
> infinitehorizon wrote
>> 
>> Hello Rui,
>> 
>> First of all, thanks a lot!
>> 1. I changed par to param,
>> 2. t is a variable too, a binary one, b is the parameter associated to
>> it,
>> 
>> 4. Yes, this is where I am stuck actually.
>> 
>> I fixed the code for likelihood functions as follows, but still getting
>> the same error:
>> 
>> L3	<- function(b1,b2,b3,b,t) {
>> P11	<- P1(b1,b,t)
>> P22	<- P2(b2,b,t)
>> P33	<- P3(b3,b,t)
>> 
>> L3l	<- P11*P22*P33
>> return(L3l)
>> }
>> 
>> L2	<- function(b1,b2,b3,b,t) {
>> P11	<- P1(b1,b,t)
>> P22	<- P2(b2,b,t)
>> P33	<- P3(b3,b,t)
>> 
>> L2l	<- P11*P22
>> return(L2l)
>> }
>> 
>> L1	<- function(b1,b2,b,t) {
>> P11	<- P1(b1,b,t)
>> P22	<- P2(b2,b,t)
>> 
>> L1l	<- P11
>> return(L1l)
>> }
>> 
>> # Log-likelihood function
>> 
>> llfn	<- function(param,a,t) {
>> 
>> b1	<- param[1]
>> b2	<- param[2]
>> b3	<- param[3]
>> b	<- param[4]
>> 
>> lL1	<- log(L1(b1,b2,b,t))
>> lL2	<- log(L2(b1,b2,b3,b,t))
>> lL3	<- log(L3(b1,b2,b3,b,t))
>> 
>> llfn	<- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3
>> }
>> start.par <- c(1,1,1,1)
>> est	<- optim(param=start.par,llfn, method =
>> c("CG"),control=list(trace=2,maxit=2000), hessian=TRUE)
>> 
>> In the end, i receive the same error message, 
>> actually, first I tried without start.par and I got the error of "object
>> param not found", then I defined start.par, it lead me to same error
>> "cannot coerce type 'closure' to vector of type 'double'".
>> 
>> I am aware that my questions might be too basic, but as I said I am not
>> familiar with syntax :/
>> Thanks for your help! 
>> 
>> Best,
>> 
>> Marc
>> 
>> 
>> 
>> Rui Barradas wrote
>>> 
>>> Hello,
>>> 
>>> There are several issues with your code.
>>> 
>>> 1. The error message. Don't use 'par' as a variable name, it's already
>>> an R function, tyo get or set graphics parameters.
>>> Call it something else, say, 'param'.
>>> This is what causes the error. You must pass initial values to optim,
>>> but the variable you're passing doesn't exist, you haven't created it so
>>> R finds an object with that name, the graphics parameters function.
>>> Avoid the confusion.
>>> And create 'param' with as many values as expected by llfn before the
>>> call.
>>> 
>>> 2. 't' is also a parameter. Take it out of llfn formals and put it in
>>> 'param'. Then, inside llfn's body,
>>> 
>>> t <- param[5]
>>> 
>>> 3. It still won't work. llfn will not be passed a value for 'a', for the
>>> same reason it can't find 't'.
>>> 
>>> 4. Then, look at L3 and the others. The line just before return.
>>> 
>>> 	L3l <- (P11=1)*(P22=1)*(P33=1)
>>> 
>>> After computing P11, etc, you're discarding those values and assigning 1
>>> to each of them.
>>> Your likelihood functions just became constants...
>>> And if this is a typo, if you meant P11 == 1, etc, it's even worse. You
>>> can't expect that ratios of exponentials to be equal to that one real
>>> value.
>>> 
>>> Points 1-3 are workable but this last one means you have to revise your
>>> likelihood.
>>> Good luck.
>>> 
>>> Hope this helps,
>>> 
>>> Rui Barradas
>>> 
>>> infinitehorizon wrote
>>>> 
>>>> Hello,
>>>> 
>>>> I am new to R and I am trying to estimate a discrete model with three
>>>> choices. I am stuck at a point and cannot find a solution.
>>>> 
>>>> I have probability functions for occurrence of these choices, and then
>>>> I build the likelihood functions associated to these choices and
>>>> finally I build the general log-likelihood function.
>>>> 
>>>> There are four parameters in the model, three of them are associated to
>>>> three discrete choices I mentioned, and one of them is for a binary
>>>> variable in the data (t).  There are also latent variables but I didn't
>>>> put them in this question because if I figure out how to do this, I
>>>> will be able to add them as well.
>>>> 
>>>> I am not familiar with the syntax I have to write in the likelihood
>>>> functions, so I really doubt that they are true.  Below I simplify the
>>>> problem and provide the code I've written:
>>>> 
>>>> # Probabilities for discrete choices for a=3, a=2 and a=1 respectively
>>>> P3	<- function(b3,b,t) {
>>>> P 	<- exp(b3+b*(t==1))/(1-exp(b3+b*(t==1)))
>>>> return(P)		
>>>> }
>>>> P2	<- function(b2,b,t) {
>>>> P 	<- exp(b2+b*(t==1))/(1-exp(b2+b*(t==1)))
>>>> return(P)		
>>>> }
>>>> P1	<- function(b1,b,t) {
>>>> P 	<- exp(b1+b*(t==1))/(1-exp(b1+b*(t==1)))
>>>> return(P)		
>>>> }
>>>> 
>>>> # Likelihood functions for discrete choices for a=3, a=2 and a=1
>>>> respectively
>>>> 
>>>> L3	<- function(b1,b2,b3,b,t) {
>>>> P11	<- P1(b1,b,t)
>>>> P22	<- P2(b2,b,t)
>>>> P33	<- P3(b3,b,t)
>>>> 
>>>> L3l	<- (P11=1)*(P22=1)*(P33=1)
>>>> return(L3l)
>>>> }
>>>> 
>>>> L2	<- function(b1,b2,b3,b,t) {
>>>> P11	<- P1(b1,b,t)
>>>> P22	<- P2(b2,b,t)
>>>> P33	<- P3(b3,b,t)
>>>> 
>>>> L2l	<- (P11=1)*(P22=1)*(P33=0)
>>>> return(L2l)
>>>> }
>>>> 
>>>> L1	<- function(b1,b2,b,t) {
>>>> P11	<- P1(b1,b,t)
>>>> P22	<- P2(b2,b,t)
>>>> 
>>>> L1l	<- (P11=1)*(P22=0)
>>>> return(L1l)
>>>> }
>>>> 
>>>> # Log-likelihood function
>>>> 
>>>> llfn	<- function(par,a,t) {
>>>> 
>>>> b1	<- par[1]
>>>> b2	<- par[2]
>>>> b3	<- par[3]
>>>> b	<- par[4]
>>>> 
>>>> lL1	<- log(L1(b1,b2,b,t))
>>>> lL2	<- log(L2(b1,b2,b3,b,t))
>>>> lL3	<- log(L3(b1,b2,b3,b,t))
>>>> 
>>>> llfn	<- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3
>>>> }
>>>> est	<- optim(par,llfn, method =
>>>> c("CG"),control=list(trace=2,maxit=2000), hessian=TRUE)
>>>> 
>>>> And when I run this code I get "cannot coerce type 'closure' to vector
>>>> of type 'double'" error.
>>>> I will really appreciate your help. Thanks,
>>>> 
>>> 
>> 
> 

--
View this message in context: http://r.789695.n4.nabble.com/Discrete-choice-model-maximum-likelihood-estimation-tp4629877p4629910.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list