
I want to use a Heckman selection model for panel data. From what I found online, gllamm in Stata seems to be able to do that.

However, I could not find a proper tutorial on how to use it. I tried to follow the one by Sophia Rabe-Hesketh but could not understand the steps.

For now I am restricting myself to cross-sectional data, where the result should be equivalent to Stata's built-in heckman command. That is,

use http://www.stata-press.com/data/r13/womenwk
gen gotowork=1
replace gotowork=0 if wage==.
heckman wage educ age, select(gotowork=married children educ age)

However, I find it hard to map these variables to the gllamm tutorial. Specifically, on Slide 10 there are y1 and y2 in the heckman command, but there is only one y in gllamm. If gotowork is y1 and wage is y2, how should this y variable be defined? Should it be wage?

And when I try to implement the following step,

reshape long y, i(id) j(var)

as

reshape long wage, i(id) j(var)

I got an error saying

variable var contains all missing values

Why?

Currently I work around this problem by doing the following step

tab gotowork, gen(i)

And I got another error for the estimation step

gen married_i1 = married*i1
gen children_i1 = children*i1
gen educ_i1 = educ*i1
gen age_i1 = age*i1

gen wage_i2 = wage*i2
gen educ_i2 = educa*i2
gen age_i2 = age*i2

eq load: i1 i2 
constraint define 1 [id1_1]i1 = 1 
gllamm wage married_i1 children_i1 educ_i1 age_i1 i1 wage_i2 educ_i2 age_i2 i2, i(id) eqs(load) nocons constr(1)

error message:

initial values not feasible
(error occurred in ML computation)
(use trace option and check correctness of initial model)

Can anyone help explain these errors and show how to use gllamm for a Heckman selection model correctly?

My ultimate goal is to implement a panel Heckman selection model. Is there any other Stata (or R) package that can do this?

Thanks.

JanLauGe
Ding Li
  • Did you take a look at the bottom of slide 12? The Stata code there tells you that when you `reshape` your data to make it long, `var = 1` when `y = y1` and `var = 2` when `y = y2`. Assuming `y2` is `wage` and `y1` is `gotowork`, your `y` column should end up holding the *actual values* of both `wage` and `gotowork`, whilst your `var` column will hold a `1` if the value in that row is `gotowork`, or a `2` if the value in that row is `wage`. – meenaparam Jul 14 '17 at 09:47
  • The link is to a single presentation. An entire website www.gllamm.org has many resources and lists several others, including books. – Nick Cox Feb 28 '18 at 21:35
  • Since there does not seem to be what is marked as an answer here, I will still go ahead to refer you to some material that allows you to estimate Heckman or sample selection models in panel settings. Refer to the paper by Semykina & Wooldridge (2010) available here https://www.sciencedirect.com/science/article/pii/S0304407610000825. The code for running the model is available at Semykina's faculty page at ( http://myweb.fsu.edu/asemykina/ ) under the same paper title, or more specifically the two-step procedure is accessible here http://myweb.fsu.edu/asemykina/two_step_se_parametric.do. – user3180480 Feb 25 '19 at 09:06

1 Answer


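This will work if you rename wage to y1 and gotowork to y2. The reshape command, going from wide to long, will then create the new variable var: reshape long y looks for variables that share the stub y (here y1 and y2) and stores their numeric suffix in var. With reshape long wage there are no variables named wage1, wage2, and so on, so var has nothing to hold and ends up all missing.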
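
The data preparation described above could be sketched roughly as follows. This only covers the renaming and reshaping, not the gllamm estimation itself. It assumes the womenwk data may not already contain an identifier, so id is created from the row number if it is missing; the use of i1 and i2 as indicator names simply follows the question.

```stata
* Sketch only: stack the outcome and the selection indicator into one response y
use http://www.stata-press.com/data/r13/womenwk, clear

* 1 if the wage is observed, 0 otherwise
gen gotowork = !missing(wage)

* Assumption: create a person identifier only if the data have none
capture confirm variable id
if _rc {
    gen id = _n
}

* Naming used in this answer: wage -> y1, gotowork -> y2
rename wage y1
rename gotowork y2

* reshape finds y1 and y2 through the stub "y" and writes the suffix to var
reshape long y, i(id) j(var)

* var==1 rows hold wage (missing for non-workers); var==2 rows hold gotowork
drop if missing(y)

* Indicators for the two response types: i1 (wage rows), i2 (selection rows)
tab var, gen(i)
```

Note that with this naming var = 1 marks the wage rows, which is the reverse of the y1/y2 labelling in the comment above, so the indicator used for each set of covariates in the later steps has to match whichever convention is chosen.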