1

I have this variable which takes on these values:

     tab expenditure
   
                            Q11 |      Freq.     Percent        Cum.
--------------------------------+-----------------------------------
                  Afs 2500-5000 |         24        3.84        3.84
                  Afs 5000-7500 |         89       14.24       18.08
                 Afs 7500-10000 |        235       37.60       55.68
I don't know / refuse to answer |          9        1.44       57.12
             Less than Afs 2500 |          5        0.80       57.92
            More than Afs 10000 |        263       42.08      100.00
--------------------------------+-----------------------------------
                          Total |        625      100.00

I would like to change the ordering, so the categories are not in alphabetical order. I tried using

label define expenditure 1 "Less than Afs 2500" 2 "Afs 2500-5000" 3 "Afs 5000-7500" 4 "Afs 7500-10000" 5 "More than Afs 10000" 6 "I don't know / refuse to answer", replace

I also tried using

recode expenditure (1 = 5) (2 = 1) (3 = 2) (4 = 3) (5 = 6) (6 = 4)

However, both methods just change the labels, not the underlying data, and now the data is all messed up (note the changes in frequencies, now only 24 observations for the "More than Afs 10000" category instead of 263 as before).

tab expenditure

                            Q11 |      Freq.     Percent        Cum.
--------------------------------+-----------------------------------
             Less than Afs 2500 |         89       14.24       14.24
                  Afs 2500-5000 |        235       37.60       51.84
                  Afs 5000-7500 |          9        1.44       53.28
                 Afs 7500-10000 |        263       42.08       95.36
            More than Afs 10000 |         24        3.84       99.20
I don't know / refuse to answer |          5        0.80      100.00
--------------------------------+-----------------------------------
                          Total |        625      100.00

What's going on? What can I do to change this without affecting my underlying data?

Joanna
  • 27
  • 4

2 Answers2

1

If your expenditure variable were string, you could just use label define and encode like this:

. clear 

. input str31 expenditure int freq 

                         expenditure      freq
  1.                   "Afs 2500-5000"        24        
  2.                   "Afs 5000-7500"        89      
  3.                  "Afs 7500-10000"       235      
  4. "I don't know / refuse to answer"         9        
  5.              "Less than Afs 2500"         5        
  6.             "More than Afs 10000"       263   
  7. end 

. label def expenditure 1 "Less than Afs 2500" 2 "Afs 2500-5000" 3 "Afs 5000-7500" 4 "Afs 7500-10000" 5 "More than Afs 10000" 6 "I don't know / refuse to answer"


. encode expenditure, gen(expenditure2) label(expenditure)


. label var expenditure2 "expenditure"


. tab expenditure2 [fw=freq]

                    expenditure |      Freq.     Percent        Cum.
--------------------------------+-----------------------------------
             Less than Afs 2500 |          5        0.80        0.80
                  Afs 2500-5000 |         24        3.84        4.64
                  Afs 5000-7500 |         89       14.24       18.88
                 Afs 7500-10000 |        235       37.60       56.48
            More than Afs 10000 |        263       42.08       98.56
I don't know / refuse to answer |          9        1.44      100.00
--------------------------------+-----------------------------------
                          Total |        625      100.00

It seems, however, that your variable is numeric, so it is as if you did this:

. clear 

. input byte expenditure int freq 

     expend~e      freq
  1. 1                       24        
  2. 2                       89      
  3. 3                      235      
  4. 4                        9        
  5. 5                        5        
  6. 6                      263   
  7. end 

. label def expenditure 5 "Less than Afs 2500" 1 "Afs 2500-5000" 2 "Afs 5000-7500" 3 "Afs 7500-10000" 6 "More than Afs 10000" 4 "I don't know / refuse to answer"

. label val expenditure expenditure 

The problem now is that you need to re-define the value labels as well as apply recode.

. recode expenditure 5=1 1=2 2=3 3=4 4=6 6=5 5=6 
(6 changes made to expenditure)


. tab expenditure [fw=freq]

                    expenditure |      Freq.     Percent        Cum.
--------------------------------+-----------------------------------
                  Afs 2500-5000 |          5        0.80        0.80
                  Afs 5000-7500 |         24        3.84        4.64
                 Afs 7500-10000 |         89       14.24       18.88
I don't know / refuse to answer |        235       37.60       56.48
             Less than Afs 2500 |        263       42.08       98.56
            More than Afs 10000 |          9        1.44      100.00
--------------------------------+-----------------------------------
                          Total |        625      100.00

. label def expenditure 1 "Less than Afs 2500" 2 "Afs 2500-5000" 3 "Afs 5000-7500" 4 "Afs 7500-10000" 5 "More than Afs 10000" 6 "I don't know / refuse to answer", modify


. tab expenditure [fw=freq]

                    expenditure |      Freq.     Percent        Cum.
--------------------------------+-----------------------------------
             Less than Afs 2500 |          5        0.80        0.80
                  Afs 2500-5000 |         24        3.84        4.64
                  Afs 5000-7500 |         89       14.24       18.88
                 Afs 7500-10000 |        235       37.60       56.48
            More than Afs 10000 |        263       42.08       98.56
I don't know / refuse to answer |          9        1.44      100.00
--------------------------------+-----------------------------------
                          Total |        625      100.00
Nick Cox
  • 35,529
  • 6
  • 31
  • 47
  • Thanks! Yeah originally the variable was string (let's call it expenses), and I used encode expenses, gen(expenditure). Then I used label def expenditure 1 "Less than Afs 2500" 2 "Afs 2500-5000" 3 "Afs 5000-7500" 4 "Afs 7500-10000" 5 "More than Afs 10000" 6 "I don't know / refuse to answer", but with replace at the end instead of modify. I think maybe this was the issue. – Joanna Sep 22 '21 at 15:50
  • Good, but it's important not to leave ambiguity about whether such variables are string or numeric. The command `dataex` was written for that kind of purpose. Further, note that you don't need to show the individual data. It is sufficient and simpler to have group frequencies as a variable. – Nick Cox Sep 22 '21 at 16:03
0

I figured it out.

label define order2 1 "Less than Afs 2500" 2 "Afs 2500-5000" 3 "Afs 5000-7500" 4 "Afs 7500-10000" 5 "More than Afs 10000" 6 "I don't know / refuse to answer"

encode expenditure, gen(expenditure2) label(order2)

does the trick without altering the data.

Joanna
  • 27
  • 4