I'm currently using Stata and I don't know which commands to use. Can someone help me solve this? The problem is as follows:

Question

We are interested in studying the impact of fertility on employment outcomes of adult men and women. We will start by constructing the main outcome variables of interest. Construct the following variables: (a) A dummy variable indicating if the observation ever worked. (b) A dummy variable indicating if the observation is employed, and another one indicating if they are in the labor force. (c) A variable measuring the number of years they have been working for their current employer.

I tried using egen, but the results were not as expected.

See https://meta.stackoverflow.com/questions/334822/how-do-i-ask-and-answer-homework-questions for guidance. — Nick Cox, Apr 02 '23 at 09:51

score 0 · Answer 1 · answered Apr 13 '23 at 10:42

Nick is correctly noting that you will get better answers if you provide more information about your situation. Lacking that, we have to guess about your data structure.

(a) A dummy variable indicating if the observation ever worked.

If we assume a variable working_now equal to 1 if the individual is working now and 0 otherwise, a time period variable year, and an individual identifier id, all arranged in an id-year panel, then we can simply do:

egen ever_worked = max(working_now), by(id)
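To see what this does, here is a minimal sketch on a made-up id-year panel. The values are invented for illustration; only the variable names follow the assumptions above.

```stata
* Toy id-year panel; all values are made up for illustration
clear
input id year working_now
1 2001 0
1 2002 1
1 2003 0
2 2001 0
2 2002 0
end

* The within-id maximum of a 0/1 indicator is 1 if any year equals 1
egen ever_worked = max(working_now), by(id)
list, sepby(id)
```

Person 1 worked in 2002, so ever_worked is 1 in all of their rows; person 2 never worked, so it is 0 throughout.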

(b) A dummy variable indicating if the observation is employed

You'll need to provide more information about your dataset for this one; sharing a small data excerpt produced with dataex (available from SSC if it is not already installed) would help.
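As a rough sketch of how a data excerpt can be produced, the example below runs dataex on Stata's bundled auto dataset. The dataset and variables are placeholders for illustration only; with your own data you would list your own variables instead.

```stata
* Placeholder data: Stata's bundled auto dataset stands in for your own
sysuse auto, clear

* Uncomment only if dataex is not already available on your installation
* ssc install dataex

* Print the first five observations of three variables as pasteable code
dataex make price mpg in 1/5
```

The output is a block of input commands that others can paste into Stata to recreate those observations exactly.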

and another one indicating if they are in the labor force.

This depends on your definition of 'in the labor force'. In New Hampshire, for example, it means the following:

Persons "in the labor force" are those in the civilian noninstitutional population, age sixteen years or older, who are employed or who are unemployed and seeking employment.

This might not be relevant to your case, but it's enough for this example.

Then let's assume an age variable with integer ages, an institutionalized variable and a seeking_work variable each equal to 1 if true and 0 otherwise. Then we can do the following:

generate in_labor_force = (age >= 16 ///
          & institutionalized==0 ///
          & (seeking_work==1 | working_now==1)) ///
          if !mi(age)
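A small worked example of this rule, using made-up data, shows why the if !mi(age) qualifier matters: Stata treats missing numeric values as larger than any number, so age >= 16 is true when age is missing. The rows below are invented for illustration, and the /// line continuation works in a do-file, not in the interactive command window.

```stata
* Toy data; values are made up for illustration
clear
input age institutionalized seeking_work working_now
15 0 0 1
30 0 1 0
45 1 0 0
50 0 0 0
.  0 0 1
end

* 1 if in the labor force, 0 if not, missing when age is unknown
generate in_labor_force = (age >= 16 ///
    & institutionalized == 0 ///
    & (seeking_work == 1 | working_now == 1)) ///
    if !mi(age)
list
```

Only the 30-year-old job seeker is coded 1. The last row stays missing instead of being wrongly coded as 1.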

(c) A variable measuring the number of years they have been working for their current employer.

For this we will also need an employer id, eid:

// indicator for the start of each employment spell
xtset id year
gen start = (eid != L.eid) if working_now==1

// running id for the spells
gen spell_id = sum(start) if working_now==1

// a constant to count
gen count_this = 1
// running count of years within each spell
bysort spell_id (year): gen spell_length = sum(count_this) if working_now==1
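Here is a minimal sketch of the spell logic on a made-up panel. All values are invented, and tenure is a hypothetical variable name. It uses _n within each spell in place of the running sum of a constant; within a spell the two give the same count.

```stata
* Toy id-year panel; eid is missing in years without a job
clear
input id year working_now eid
1 2001 1 10
1 2002 1 10
1 2003 1 20
1 2004 0 .
2 2001 1 30
2 2002 1 30
end

xtset id year

* A spell starts when the employer differs from last year's employer
* (L.eid is missing in each person's first year, so that year starts a spell)
gen start = (eid != L.eid) if working_now == 1

* The running sum of the start flags gives each spell its own id
gen spell_id = sum(start) if working_now == 1

* Years with the current employer so far, counted within each spell
bysort spell_id (year): gen tenure = _n if working_now == 1

sort id year
list, sepby(id)
```

Person 1 has tenure 1 and 2 with employer 10, then tenure resets to 1 on moving to employer 20, and is missing in 2004 when they are not working.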

Nick has written several nice articles on how to do this, none of which I'm doing justice to here. For example: Speaking Stata: Identifying spells.
