[R] Heckman Selection Model Help in R

Arne Henningsen arne.henningsen at googlemail.com
Mon Jul 13 16:09:12 CEST 2009


On Mon, Jul 13, 2009 at 11:18 AM, Pathak,
Saurav<s.pathak08 at imperial.ac.uk> wrote:
> Dear Arne
> I have gone through the paper and tried it at my end. I would really appreciate it if you could address the following:
>
> 1. Based upon your suggestion I used the following:
>
> regmod2 <- selection(s ~ age + gender + gemedu + gemhinc + es_gdppc +
>    imf_pop + estbbo_m, ln_oy5_1 ~ age + gender + fearfail + gemedu,
>    adpopdata, method = "2step")
> On trying the above (notice that I have changed "heckit" to "selection" in the command), I get the following error message:
>
> Error in coef.probit(result$probit) :
>  could not find function "coef.maxLik"

That's weird. Which versions of R, sampleSelection, and maxLik do you use?
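
A minimal sketch for collecting that version information in one go. The helper name pkg_version is made up for this illustration; it only wraps base R's requireNamespace() and packageVersion() so that a package that is not installed shows up as NA instead of raising an error.

```r
# Hypothetical helper: version of an installed package, or NA if it is missing.
pkg_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# R itself plus the two packages involved in the error message.
versions <- c(
  R               = as.character(getRversion()),
  sampleSelection = pkg_version("sampleSelection"),
  maxLik          = pkg_version("maxLik")
)
print(versions)
```

The output of sessionInfo() gives the same information together with the platform and all attached packages, which is usually the most useful thing to paste into a reply.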

> Before trying the above I tried the following:
>
> 2. When I tried to do the Heckman selection model in stages, the first run was successful, using the following:
>
> myProbit <- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc +
>     imf_pop + estbbo_m, family = binomial(link = "probit"))
> summary(myProbit)
>
> I am successful up to this point, but
>
> 3. When I try calculating the IMR using the following:
> adpopdata$IMR<-invMillsRatio(myProbit)$IMR1
>
> I get the error below
> Error in `$<-.data.frame`(`*tmp*`, "IMR", value = c(2.50039945424535,  :
>  replacement has 257358 rows, data has 343251

I guess that you have some NAs in the data, so that you get the IMRs
not for all observations but only for the observations without NAs.

R> myIMRs <- invMillsRatio(myProbit)$IMR1
should work.
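
If the values should nevertheless end up as a column of the original data frame, one option is to match them to the rows that glm() actually used. The following is only a sketch on simulated data: the data frame dat and the variables s and x are invented for illustration, and the linear predictor stands in for any per-observation vector that has one element per row used in the fit.

```r
set.seed(42)
n <- 200
dat <- data.frame(s = rbinom(n, 1, 0.5), x = rnorm(n))
dat$x[sample(n, 20)] <- NA          # 20 rows with a missing regressor

fit <- glm(s ~ x, family = binomial(link = "probit"), data = dat)

# One value per row used in the fit, i.e. fewer values than rows in dat.
xb <- predict(fit, type = "link")
length(xb)   # 180
nrow(dat)    # 200

# Row names of the model frame identify the rows that survived NA removal.
used <- rownames(model.frame(fit))

# Fill a full-length column, leaving NA where the row was dropped.
dat$xb <- NA_real_
dat[used, "xb"] <- xb
```

An alternative is to fit the model with na.action = na.exclude, in which case fitted() and predict() pad the dropped rows with NA themselves.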

> Is there code to calculate the IMR by hand?

Yes, inside invMillsRatio().
However, why do you want to do this?
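
For reference, a rough sketch of the calculation in base R, using the textbook definition of the inverse Mills ratio for the selected observations, dnorm(z) / pnorm(z), evaluated at the probit's linear predictor z. This is not copied from invMillsRatio(); the function name imr_selected and the simulated variables x, w, s and y are assumptions made up for the example.

```r
# Textbook IMR for selected observations: phi(z) / Phi(z).
# Computed on the log scale so that very negative z does not give 0/0.
imr_selected <- function(z) {
  exp(dnorm(z, log = TRUE) - pnorm(z, log.p = TRUE))
}

# Simulated data with correlated errors in selection and outcome equations.
set.seed(1)
n <- 500
x <- rnorm(n)
w <- rnorm(n)                          # appears in the selection equation only
u <- rnorm(n)
e <- 0.7 * u + rnorm(n, sd = 0.5)
s <- as.integer(0.5 + x + w + u > 0)   # selection indicator
y <- ifelse(s == 1, 1 + 2 * x + e, NA) # outcome observed only if selected
dat <- data.frame(s, x, w, y)

# Step 1: probit selection equation.
probit <- glm(s ~ x + w, family = binomial(link = "probit"), data = dat)

# Passing newdata returns one value per row of dat (NA where regressors are NA).
dat$IMR <- imr_selected(predict(probit, newdata = dat, type = "link"))

# Step 2: OLS on the selected observations with the IMR as extra regressor.
ols <- lm(y ~ x + IMR, data = dat, subset = s == 1)
print(coef(ols))
```

Note that the standard errors reported by a plain lm() in the second step ignore the fact that the IMR is itself estimated. As far as I know, selection() with method = "2step" takes care of this, which is one reason to prefer it over doing the steps by hand.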

> What I see is that the number of rows of the calculated IMR and the number
> of rows in the actual data set do not match (maybe a missing-value
> issue; I am not sure, and if it is, how do I fix it?). Hence the IMR could
> not be added to my original data set. How do I fix this and then proceed
> to get the correct IMR to use in my outcome equation (the OLS stage)?
>
> This is really taking a lot of time; I have been working on it for weeks. Can
> you please help me? If you wish, I can send you the data set as well.

Please try to fix it yourself.

Arne

>
> -----Original Message-----
> From: Arne Henningsen [mailto:arne.henningsen at googlemail.com]
> Sent: 13 July 2009 00:56
> To: Pathak, Saurav; r-help at r-project.org; otoomet at ut.ee
> Subject: Re: Heckman Selection Model Help in R
>
> Hi Saurav!
>
> On Sun, Jul 12, 2009 at 6:06 PM, Pathak,
> Saurav<s.pathak08 at imperial.ac.uk> wrote:
>> I am new to R and have to estimate a 2-step Heckman model. My selection equation is
>> below; I ran it successfully but am unable to proceed further.
>>
>>
>>
>> I have so far used the following command
>>
>> glm(formula = s ~ age + gender + gemedu + gemhinc + es_gdppc +
>>     imf_pop + estbbo_m, family = binomial(link = "probit"))
>>
>> My questions are:
>> 1. How do I discard the non-significant selection variables (one out of the
>> seven variables above is non-significant) and calculate the Inverse Mills
>> Ratio from the significant variables?
>>
>> 2. I need the Inverse Mills Ratio (IMR) from the above probit to use as a
>> control in my outcome equation, which I want to estimate by OLS. Is there
>> a more direct way to get the IMR?
>>
>> 3. How can this be done in R using my approach, or is there
>> another way of achieving this?
>>
>>
>>
>> On trying
>>
>> regmod <- heckit(s ~ age + gender + gemedu + gemhinc + es_gdppc +
>>     imf_pop + estbbo_m, ln_oy5_1 ~ age + gender + fearfail + gemedu,
>>     adpopdata, method = "2step")
>>
>>
>>
>> I get
>>
>> Error: could not find function "heckit"
>>
>>
>>
>> Error: could not find function "invMillsRatio"
>>
>>
>>
>> Am I missing something? Do I have to install something apart from R?
>> So far I have used
>>
>>
>>
>> install.packages( "sampleSelection", repos="http://R-Forge.R-project.org" )
>>
>> install.packages("Rcmdr", dependencies=TRUE)
>>
>>
>>
>> Even then I am unable to run heckit. Please help.
>
> You have to install (only once) and *load* the package before you can use it:
> R> library( "sampleSelection" )
>
> I suggest that you do NOT use function "heckit" but function
> "selection", because the latter allows you to estimate the model both
> by the 2-step and the 1-step (ML) method (depending on argument
> "method").
>
> Our paper about the sampleSelection package, published in the JSS, could
> also be helpful for you:
> http://www.jstatsoft.org/v27/i07/
>
> Arne
>
> --
> Arne Henningsen
> http://www.arne-henningsen.name
>



-- 
Arne Henningsen
http://www.arne-henningsen.name



