I am trying to decompose the gender pay gap with a two-fold Blinder-Oaxaca decomposition (OLS regression) in three ways: Stata, R, and a manual calculation in Excel.
All three give me similar results; however, I believe the results are wrong. The reason is that the slope for occupation differs a lot between the female and male regressions, yet occupation explains only -0.001% of the overall difference. I have been using deviation coding for a 4-level factor. I also tried simplifying the variable to only 2 levels ([0] Manager; [1] Other); coding the values as 1 and 2 instead does not help.
My calculation (β: slope; X: mean of the particular variable; the outcome is log wages):

| | βm | βf | Xm | Xf |
|---|---|---|---|---|
| Occupation | 0.031 | 0.183 | 0.14 | 0.14 |

Explained occupation: βm(Xm-Xf)
Unexplained occupation: (βm-βf)Xf
So obviously the problem is that I am multiplying 0.031 by (0.14 - 0.14), which equals 0.
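
As a sanity check, here is a minimal Python sketch of the same arithmetic. The variable names are mine and purely illustrative; the numbers are the ones from my table, and the two formulas are the explained and unexplained terms written above.

```python
# Two-fold Blinder-Oaxaca contribution of a single regressor (occupation).
# Numbers are taken from the table in the question; names are illustrative.
beta_m, beta_f = 0.031, 0.183   # slopes from the male and female regressions
x_m, x_f = 0.14, 0.14           # group means of the occupation variable

explained = beta_m * (x_m - x_f)        # explained part: βm(Xm - Xf)
unexplained = (beta_m - beta_f) * x_f   # unexplained part: (βm - βf)Xf

print(f"explained:   {explained:.5f}")
print(f"unexplained: {unexplained:.5f}")
```

With these inputs the explained term is exactly 0, because the two means are identical, while the unexplained term comes out at roughly -0.021. So the difference in slopes does appear in the calculation, only in the unexplained part rather than the explained one.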
So my question is: Am I handling the factor variable correctly? What sense does it make to use the mean for a dummy (or deviation coded) factor variable?
However, as stated before, the Oaxaca package for Stata gives me the same result as R and Excel (manual calculation). I am lost.