I would like to estimate a Heckman selection model manually in R. My problem is that the standard errors from the manual second step are wrong. Is there a way to correct them manually as well?

Below is my sample code: first using the sampleSelection package (correct SEs), then the manual version (correct estimates, wrong SEs).

library(sampleSelection)

data( Mroz87 )
Mroz87$kids <- ( Mroz87$kids5 + Mroz87$kids618 > 0 )

Using sampleSelection

heckman <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ,
                     outcome = wage ~ exper + I(exper^2) + educ + city,
                     data = Mroz87, method = "2step")
summary(heckman)

Manually

seleqn1 <- glm(lfp ~ age + I(age^2) + faminc + kids + educ, family=binomial(link="probit"), data=Mroz87)
summary(seleqn1)

# Calculate inverse Mills ratio by hand ##
Mroz87$IMR <- dnorm(seleqn1$linear.predictors)/pnorm(seleqn1$linear.predictors)

# Outcome equation correcting for selection ## ==> correct estimates, wrong SEs
outeqn1 <- lm(wage ~ exper + I(exper^2) + educ + city + IMR, data=Mroz87, subset=(lfp==1))
summary(outeqn1)
research111

1 Answer

myprobit    <- probit(lfp ~ age + I(age^2) + faminc + kids + educ - 1, x = TRUE, 
                           iterlim = 30, data=Mroz87)

imrData     <- invMillsRatio(myprobit) # same as yours in this particular case
Mroz87$IMR1 <- imrData$IMR1

outeqn1     <- lm(wage ~ -1 + exper + I(exper^2) + educ + city + IMR1, 
                  data=Mroz87, subset=(lfp==1))

The main difference is that you used models with an intercept rather than no-intercept models.

Hack-R
  • Thanks, but now I seem to get not only different standard errors but also different estimates? – research111 Aug 22 '16 at 09:14
  • @research111 I will look into the SE issue more. I got the same results as the package but I used `heckit2fit()` as the basis of comparison. – Hack-R Aug 22 '16 at 12:52
  • @research111 If nothing else perhaps we could contact the package author for clarification and send him a link to this thread. otoomet@ut.ee – Hack-R Aug 22 '16 at 12:56
  • Thanks, in that case I also get the same results. The standard errors seem to need correcting because it's a two-step procedure. I found code in Stata [here](http://www.stata.com/statalist/archive/2010-02/msg00308.html) but not yet in R. – research111 Aug 22 '16 at 21:54
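
For reference, the two-step correction mentioned in the last comment can be sketched in R along the following lines. This is a minimal sketch rather than the package's own implementation: it assumes the textbook (Greene-style) covariance formula for the Heckman two-step estimator, applied to the question's original intercept models. All object names (probit_fit, ols_fit, W, Z, delta, and so on) are made up for illustration, and sampleSelection is only needed for the Mroz87 data.

```r
library(sampleSelection)  # only used here for the Mroz87 data set

data(Mroz87)
Mroz87$kids <- (Mroz87$kids5 + Mroz87$kids618 > 0)

# Step 1: probit selection equation and inverse Mills ratio
probit_fit <- glm(lfp ~ age + I(age^2) + faminc + kids + educ,
                  family = binomial(link = "probit"), data = Mroz87)
zg <- probit_fit$linear.predictors
Mroz87$IMR <- dnorm(zg) / pnorm(zg)

# Step 2: OLS on the selected sample, including the IMR as a regressor
sel <- Mroz87$lfp == 1
ols_fit <- lm(wage ~ exper + I(exper^2) + educ + city + IMR,
              data = Mroz87, subset = sel)

# Ingredients of the corrected covariance matrix (assumed formula):
#   V = sigma2 * (W'W)^-1 [ W'(I - rho2 * D) W + Q ] (W'W)^-1
#   Q = rho2 * (W' D Z) V_gamma (Z' D W),  D = diag(delta)
W      <- model.matrix(ols_fit)                     # outcome regressors incl. IMR
Z      <- model.matrix(probit_fit)[sel, , drop = FALSE]  # selection regressors
lambda <- Mroz87$IMR[sel]
delta  <- lambda * (lambda + zg[sel])               # delta_i = lambda_i (lambda_i + z_i'gamma)

b_lambda <- unname(coef(ols_fit)["IMR"])
e        <- residuals(ols_fit)
sigma2   <- sum(e^2) / length(e) + b_lambda^2 * mean(delta)
rho2     <- b_lambda^2 / sigma2

WtW_inv <- solve(crossprod(W))
WdZ     <- crossprod(W, delta * Z)                  # W' D Z
Q       <- rho2 * WdZ %*% vcov(probit_fit) %*% t(WdZ)
middle  <- crossprod(W, (1 - rho2 * delta) * W) + Q

V_corrected <- sigma2 * WtW_inv %*% middle %*% WtW_inv
se <- sqrt(diag(V_corrected))

# Same point estimates as lm(), with corrected standard errors alongside
cbind(Estimate = coef(ols_fit), Corrected.SE = se)
```

The corrected standard errors can then be compared with `summary(heckman)` from the question. Small differences are possible, for example because `vcov()` on a `glm` probit fit need not coincide exactly with the first-step covariance estimate the package uses.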