encode should work fine. Here is an example.
. * Example generated by -dataex-. For more info, type help dataex
. clear
. input str2 strvar
strvar
1. "A"
2. "B"
3. "AB"
4. "AC"
5. end
. encode strvar, gen(numvar)
. list
     +-----------------+
     | strvar   numvar |
     |-----------------|
  1. |      A        A |
  2. |      B        B |
  3. |     AB       AB |
  4. |     AC       AC |
     +-----------------+
. label list
numvar:
1 A
2 AB
3 AC
4 B
. list, nolabel
     +-----------------+
     | strvar   numvar |
     |-----------------|
  1. |      A        1 |
  2. |      B        4 |
  3. |     AB        2 |
  4. |     AC        3 |
     +-----------------+
encode by default maps the distinct string values, sorted alphabetically, to the integers 1 up. That is why B, which sorts last here, is stored as 4.
If you don't like the default, you need to specify your own scheme for translation by defining the value labels first. Doing that carefully should remove, or at least reduce, any need for a recode.
. label def wanted 1 "A" 2 "B" 3 "AB" 4 "AC"
. encode strvar, gen(wanted) label(wanted)
. list
     +--------------------------+
     | strvar   numvar   wanted |
     |--------------------------|
  1. |      A        A        A |
  2. |      B        B        B |
  3. |     AB       AB       AB |
  4. |     AC       AC       AC |
     +--------------------------+
. list, nolabel
     +--------------------------+
     | strvar   numvar   wanted |
     |--------------------------|
  1. |      A        1        1 |
  2. |      B        4        2 |
  3. |     AB        2        3 |
  4. |     AC        3        4 |
     +--------------------------+
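
One further check, offered as a sketch rather than as part of the example above: encode has a noextend option. With it, encode exits with an error if strvar contains any value not already defined in the label named in label(), instead of quietly adding new entries to that label. The variable name check is made up for illustration.

. * check is a hypothetical variable name; noextend refuses values missing from label wanted
. encode strvar, gen(check) label(wanted) noextend
. assert check == wanted

With the four values used here, all of which are defined in wanted, both commands should run without complaint. An error from the first would point to a string value that the label definition missed.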