
Say I have a categorical variable, for example a country column in a table.
How can I quickly add dummy variables for each category--WITH A RELEVANT NAME?

So if the column is for country, the variable for whether the person lives in the USA would be called USA, not country16 or something similar.

Nick Cox
Dan
  • Watch out: country names with spaces won't be legal variable names. `"United States"` would be one such. – Nick Cox Nov 11 '17 at 10:02

1 Answer


This is pretty easy:

/* Make some fake data: first word of make, lowercased, periods removed */
sysuse auto, clear
gen make_only = subinstr(lower(word(make,1)),".","",.)

/* Create meaningful dummies: one 0/1 variable per distinct value, named after that value */
levelsof make_only, clean local(makes)
foreach m of local makes {
    gen `m' = cond(make_only=="`m'",1,0)
}
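
The loop above relies on every category value being a legal variable name, which breaks for values with spaces such as "United States". Here is a minimal sketch of one workaround, assuming a Stata version that provides the built-in strtoname() function. The country variable and its values are made up for illustration:

```stata
/* Hypothetical data: a string variable whose values contain spaces */
clear
input str20 country
"United States"
"France"
"United States"
"South Korea"
end

/* No clean option, so multiword values stay quoted as single elements */
levelsof country, local(countries)
foreach c of local countries {
    /* Convert the value to a legal name, e.g. United_States */
    local v = strtoname("`c'")
    gen byte `v' = (country == "`c'")
    /* Keep the original spelling as the variable label */
    label variable `v' "`c'"
}
```

Two values that map to the same legal name would make the second gen fail, so this is worth checking on real data.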

However, it is probably easier to just use factor-variable notation:

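/* sencode is user-written: ssc install sencode */
sencode make_only, label(make_only) replace
reg price i.make_only
/* Refer to a category by its label rather than its numeric code */
list make price if make_only=="amc":make_only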

Regression output will be nicely labeled, you don't create extra variables, and it's easy enough to refer to particular values.


sencode is written by Roger Newson and is available from SSC.
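
If installing a user-written command is not an option, the built-in encode command should work in much the same way. This is a sketch of that swap, not part of the approach above; make_id is a hypothetical name for the new numeric variable:

```stata
sysuse auto, clear
gen make_only = subinstr(lower(word(make,1)),".","",.)

/* encode creates a new numeric variable; by default its value label shares the new variable's name */
encode make_only, generate(make_id)
regress price i.make_id
list make price if make_id == "amc":make_id
```

Unlike sencode with replace, this leaves the original string variable in place alongside the new one.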

dimitriy