I am trying to generate a dummy variable that equals 1 if at least two (out of seven) other dummy variables equal 1. Could anybody tell me an efficient way of doing this?

Nick Cox

1 Answer

Let's suppose that the indicator variables concerned (you say "dummy variables", but that's a terminology over-used given its disadvantages) are x1 ... x7. From that definition it is taken that their values are 1 or 0, except that values may also be missing. Then the logic for the summary you want is

gen xs = (x1 + x2 + x3 + x4 + x5 + x6 + x7) >= 2 if (x1 + x2 + x3 + x4 + x5 + x6 + x7) < . 

That's not too difficult to type, given copy and paste to replicate the syntax for the sum. The if qualifier segregates any observations with missing on any of the indicators, for which missing will be returned for the new variable. Such observations will be reported as having a total x1 + x2 + x3 + x4 + x5 + x6 + x7 that is missing. Missing is treated as arbitrarily large in Stata, and certainly as greater than 2, which explains why the simpler code

gen xs = (x1 + x2 + x3 + x4 + x5 + x6 + x7) >= 2 

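would bite you if missings were present: any observation with a missing indicator has a missing total, which counts as >= 2, so xs would be set to 1 for it.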
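To see the difference between the two versions, here is a small sketch on made-up data. The values and the names xs_safe and xs_naive are only for illustration, and it uses three indicators rather than seven to keep it short.

```stata
clear
input x1 x2 x3
1 1 0
1 0 0
0 0 0
1 . 0
end

* guarded version: result is missing whenever the total is missing
gen xs_safe = (x1 + x2 + x3) >= 2 if (x1 + x2 + x3) < .

* unguarded version: a missing total is treated as >= 2
gen xs_naive = (x1 + x2 + x3) >= 2

list
```

In the last observation the total is missing, so xs_safe is left missing while xs_naive is set to 1.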

If you want a more complicated rule, you may find yourself reaching for egen functions rowtotal(), rowmiss(), and so forth. See the help for egen.
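As a rough sketch of what such a rule might look like, the code below counts the known 1s with rowtotal() and the missings with rowmiss(). The rule itself is an assumption chosen for illustration, not something from the question: flag an observation as 1 when at least two indicators are known to be 1, as 0 when even the missing ones could not bring the count to 2, and as missing otherwise. The data and the names ntrue, nmiss and xs2 are made up.

```stata
clear
input x1 x2 x3 x4 x5 x6 x7
1 1 0 0 0 0 0
1 0 0 0 0 0 0
1 . 0 0 0 0 0
1 . . 0 0 0 0
1 1 . 0 0 0 0
end

* x1-x7 as a varlist assumes the variables are adjacent in the dataset
egen ntrue = rowtotal(x1-x7)    // missings are ignored, i.e. counted as 0
egen nmiss = rowmiss(x1-x7)     // number of missing indicators

* illustrative rule (an assumption): 1 if two or more known 1s,
* missing if the answer could depend on the missing values, else 0
gen xs2 = ntrue >= 2
replace xs2 = . if ntrue < 2 & ntrue + nmiss >= 2

list
```

Unlike the simple sum, rowtotal() never returns missing by default, so the comparison ntrue >= 2 is safe without an if qualifier; the missings have to be handled explicitly through nmiss.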

Nick Cox
  • The solution I used was gen variable1=0 gen variable1=1 if x1 + x2 + x3 == 1, with == 2 if I am interested in the variable that has 2, >= 3 if greater than or equal to 3, and so on. My apologies for the novice question. – Econometrics33 Jun 29 '15 at 20:26
  • Sorry, but your comment is not clear to me. The code wouldn't run as you post it. If the answer is not what you want, please edit the original question. – Nick Cox Jun 29 '15 at 20:28