8

I am working with a set of dta files representing surveys from different years.

Conveniently, each year uses different values for the country variable, so I am trying to set the country value labels for each year to match. I am having trouble comparing value labels though.

So far, I have come up with the following code:

replace country=1 if countryO=="Japan"
replace country=2 if countryO=="South Korea" | countryO=="Korea"
replace country=3 if countryO=="China"
replace country=4 if countryO=="Malaysia"

However, this doesn't work because "Japan" is the value label, not the actual value.

How do I tell Stata that I am comparing the value label?

Nick Cox
jamzsabb
  • 3
    Strictly, the `if` here is the `if` qualifier; the `if` command is different. Note that if you look in the index to the User's Guide, there are just two entries on value labels, and 13.10 contains the detail you need. Compare my earlier remarks on Googling when you have the documentation right there. – Nick Cox Mar 31 '14 at 17:53
  • This Stata User's Guide is great, I was quickly able to figure out what I wasn't understanding, which is that value labels are stored in their own object and then mapped to a variable. Thanks for suggesting this, the Stata help pages came back in Google results but not this manual. – jamzsabb Mar 31 '14 at 21:38
  • 2
    @PearlySpencer improved your question by cutting distracting chatter. Good questions here are intensely technical. Please don’t feel irritated by his edit, but accept it with good grace as coming from a more experienced member. – Nick Cox Apr 02 '19 at 21:22

3 Answers

14

Try

replace country=1 if countryO=="Japan":country0valuelabel
replace country=2 if inlist(countryO,"South Korea":country0valuelabel,"Korea":country0valuelabel)

You will have to replace country0valuelabel with the name of the value label attached to countryO in your data. You can find that name in the penultimate column of the output of describe countryO.
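A minimal, self-contained sketch of the same idea on toy data. The old codes (10, 20, 30) and the label name ctry2010 are made up for illustration; only the `"text":labelname` syntax comes from the answer above.

```stata
clear
input countryO
20
10
30
end

* Hypothetical value label for one survey year
label define ctry2010 10 "Japan" 20 "Korea" 30 "China"
label values countryO ctry2010

* "text":labelname evaluates to the numeric code behind that label text
generate byte country = .
replace country = 1 if countryO == "Japan":ctry2010
replace country = 2 if countryO == "Korea":ctry2010
replace country = 3 if countryO == "China":ctry2010

list, nolabel
```

The comparison is still numeric; the label text is only used to look up which number to compare against.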

dimitriy
  • Precisely what I needed, thanks for the suggestion. The describe countryO command helped me put it all together, thanks for the help – jamzsabb Mar 31 '14 at 21:41
2

To complement @Dimitriy's answer:

clear all
set more off

sysuse auto
keep foreign weight

describe foreign
label list origin

replace weight = . if foreign == 0

list in 1/15
list in 1/15, nolabel 

describe displays the value label associated with a variable. label list can show the content of a particular value label.
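As a possible follow-up on the same dataset, the label name can also be fetched programmatically and then used in a comparison. This is a sketch, not part of the original answer; the macro name lbl is arbitrary.

```stata
sysuse auto, clear

* Name of the value label attached to foreign
local lbl : value label foreign
display "`lbl'"

* Count observations whose label text is "Foreign"
count if foreign == "Foreign":`lbl'
```

Run this from a do-file or in one go, since the local macro only exists within the session or do-file that defines it.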

Roberto Ferrer
0

I know I'm responding to this post years later, but I wanted to provide a solution that will work for multiple variables in case anybody comes across this.

My task was similar, except that in every variable with a "Refused" response I had to recode the numeric value for that response (8, 9, 99, etc.) to a missing value (., .r, .b, etc.). The numeric code for "Refused" depended on the value label: some variables had it coded as 9, while others used 99 or 8.

Version information: Stata 15.1

Code

    * For each variable, look up the name of its value label and
    * recode the "Refused" code used by that label to .r
    foreach v of varlist * {
        if `"`: val label `v''"' == "yndkr" {
            recode `v' (9 = .r)
        }
        else if `"`: val label `v''"' == "bw3" {
            recode `v' (9 = .r)
        }
        else if `"`: val label `v''"' == "def_some" {
            recode `v' (9 = .r)
        }
        else if `"`: val label `v''"' == "difficulty5" {
            recode `v' (9 = .r)
        }
    }

You can keep adding as many else if commands as needed. I only showed a chunk of my entire loop, but I hope this demonstrates what needs to be done. If you need to find the name of your value labels, use the command labelbook and it will print them all for you.
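When several value labels share the same "Refused" code, the branches can be grouped with `inlist()`. The sketch below uses toy data; the variables q1 to q3 and the label contents are invented for illustration, and only the label names come from the loop above.

```stata
clear
input q1 q2 q3
1 1 9
9 2 1
2 9 9
end

label define yndkr 1 "Yes" 2 "No" 9 "Refused"
label define bw3 1 "Low" 2 "High" 9 "Refused"
label values q1 yndkr
label values q2 bw3
* q3 has no value label, so the loop should leave it untouched

foreach v of varlist * {
    * Name of the value label attached to this variable (empty if none)
    local lbl : value label `v'
    if inlist("`lbl'", "yndkr", "bw3", "def_some", "difficulty5") {
        recode `v' (9 = .r)
    }
}

list, nolabel
```

Labels that use a different code for "Refused" would still need their own branch, which is the flexibility the longer `else if` chain keeps.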

  • 1
    Your answer should either use the jargon found in the question or provide a self-contained example with data (as in Roberto Ferrer's answer). As it stands it is not very informative to anyone except you perhaps. –  Apr 01 '19 at 13:29
  • For the underlying problem here, it can be hard not to write messy, _ad hoc_ code when the problem is messy. But for categorical variables that were coded inconsistently, apart from raging quietly at whoever did that, I would consider a `decode` of them all, defining a set of value labels in one place and then an `encode` of them all. `multencode` from SSC is one specific tool here. – Nick Cox Apr 01 '19 at 13:49
  • In your example code, the innermost statement is identical for the four cases covered. `inlist()` would give you a way to group them into one. – Nick Cox Apr 01 '19 at 13:52
  • Thank you both for the information on best practice to providing possible answers here - this was my first contribution and I will use what you said. I made the correction in the edit - and you are correct. I decided to go this route, and post this route, as it should have more flexibility if I have to revisit this mess and make further edits. – Corey Bryant Apr 01 '19 at 17:02
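The `decode` and `encode` route mentioned in the comments could look roughly like the sketch below. It uses toy data; the old codes, the label name oldlbl, and the string variable cname are assumptions made for illustration.

```stata
clear
input countryO
10
20
30
end

* Hypothetical value label from one survey year
label define oldlbl 10 "Japan" 20 "Korea" 30 "China"
label values countryO oldlbl

* Turn the labelled codes into strings and harmonise the spellings
decode countryO, generate(cname)
replace cname = "South Korea" if cname == "Korea"

* Define the common coding once, then encode against it;
* noextend makes encode stop if cname holds a name missing from the label
label define country 1 "Japan" 2 "South Korea" 3 "China" 4 "Malaysia"
encode cname, generate(country) label(country) noextend

list, nolabel
```

Because the target coding is defined in one place, the same `label define` line can be reused across every year's file.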