I have a variable in Stata which consists of letters such as A, B, AB, AC, etc.

I want to convert it to a numeric variable, with numbers in place of the letters, such as 1 instead of A.

I tried to encode the variable and then recode but it does not work.

I also tried to generate a new variable with if but that also does not work.

Nick Cox
  • 35,529
  • 6
  • 31
  • 47
  • A report "does not work" is useless without showing your exact code and giving an explanation of what that means. – Nick Cox May 14 '23 at 09:26
  • Please see https://stackoverflow.com/help/minimal-reproducible-example for the standard here. – Nick Cox May 14 '23 at 09:35

1 Answer

encode should work fine. Here is an example.

. * Example generated by -dataex-. For more info, type help dataex
. clear

. input str2 strvar

        strvar
  1. "A" 
  2. "B" 
  3. "AB"
  4. "AC"
  5. end

. encode strvar, gen(numvar)

. list 

     +-----------------+
     | strvar   numvar |
     |-----------------|
  1. |      A        A |
  2. |      B        B |
  3. |     AB       AB |
  4. |     AC       AC |
     +-----------------+

. label list 
numvar:
           1 A
           2 AB
           3 AC
           4 B

. list, nolabel

     +-----------------+
     | strvar   numvar |
     |-----------------|
  1. |      A        1 |
  2. |      B        4 |
  3. |     AB        2 |
  4. |     AC        3 |
     +-----------------+

By default, encode sorts the distinct strings alphabetically and maps them to the integers 1, 2, 3, and so on. That is why B, which sorts last here, is mapped to 4.

If you don't like the default, you need to specify your own translation scheme by defining the value labels before calling encode. Doing that carefully should remove, or at least reduce, any need for a recode.

. label def wanted 1 "A" 2 "B" 3 "AB" 4 "AC"

. encode strvar, gen(wanted) label(wanted)  

. list 

     +--------------------------+
     | strvar   numvar   wanted |
     |--------------------------|
  1. |      A        A        A |
  2. |      B        B        B |
  3. |     AB       AB       AB |
  4. |     AC       AC       AC |
     +--------------------------+

. list, nolabel 

     +--------------------------+
     | strvar   numvar   wanted |
     |--------------------------|
  1. |      A        1        1 |
  2. |      B        4        2 |
  3. |     AB        2        3 |
  4. |     AC        3        4 |
     +--------------------------+
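The question also mentions generating a new variable with if. The code that failed was not shown, so the following is only a sketch of how that route could look, assuming the same four values as above; the variable name viaif is made up for illustration. Note the double equals sign for comparison and the double quotes around each string value; omitting either is a common way for such code to fail.

```stata
* Self-contained sketch: same example data as above
clear
input str2 strvar
"A"
"B"
"AB"
"AC"
end

* Start with missing, then fill in one value per distinct string
generate byte viaif = .
replace viaif = 1 if strvar == "A"
replace viaif = 2 if strvar == "B"
replace viaif = 3 if strvar == "AB"
replace viaif = 4 if strvar == "AC"

* Stop with an error if any string was left unmapped
assert !missing(viaif)

list
```

Unlike encode, this leaves viaif without value labels, so the link between the numbers and the original letters is recorded only in the code. With more than a handful of distinct strings, defining a label and using encode with label() is likely to be less error-prone.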

Nick Cox