1

My aim is to strip the underscore, and all digits that follow it, from the end of each word in a given local. Assume that underscores followed by numbers occur only at the end of words.

By using subinstr(), I am able to specify that I want to eliminate _1 (and possibly loop over different numbers), but the double-loop syntax seems overly complicated for such a task:

local list_x `" "rep78_3" "make_1" "price_1" "mpg_2" "'
local n_x : list sizeof list_x

forvalues j = 1/`n_x' {
    local varname: word `j' of `list_x'
    local clean_name: subinstr local varname "_1" "" 
    display "`clean_name'" 
}

I tried to look into regexm() and regexs(), but I am not quite sure how to set up the code.

I understand there might be multiple ways to solve this.

Maybe there is a simpler way to address the issue that I cannot see?

Stefano Lombardi

4 Answers

4

With the Unicode regular-expression functions introduced in Stata 14, you can replace all matches at once.

. local list_x `" "rep78_3" "make_1" "price_1" "mpg_2" "'

. local fixed = ustrregexra(`"`list_x'"', "_[0-9]+","")

. dis `"`fixed'"'
 "rep78" "make" "price" "mpg" 
Robert Picard
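
The pattern in this answer removes an underscore plus digits wherever they occur in a word. If only suffixes at the very end of a word should go, one option is to require a word boundary after the digits. This is a minimal sketch, assuming that ustrregexra() accepts the ICU-style \b anchor. The name q_1b_12 is hypothetical and is included only to show a mid-word _1 being left alone.

```stata
// q_1b_12 is a hypothetical name: only its trailing _12 should be removed
local list_x `" "rep78_3" "make_1" "q_1b_12" "'

// \b demands a word boundary right after the digits (assumed ICU behaviour)
local fixed = ustrregexra(`"`list_x'"', "_[0-9]+\b", "")

display `"`fixed'"'
```

The result stays in a single local, so it can be passed to foreach or to the macro list functions without rebuilding the list word by word.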
1

Using string functions:

local list_x rep78_3 make_1 price_1 mpg_2

// assumes only one _
foreach elem of local list_x {
    local pos = strpos("`elem'", "_")
    local clean = substr("`elem'", 1, `pos' - 1) 
    di "`clean'" 
}

// considers last _ (there can be multiple)
foreach elem of local list_x {
    local pos = strpos(reverse("`elem'"), "_")
    local clean = reverse(substr(reverse("`elem'"), `pos' + 1, .))
    di "`clean'" 
}

You can nest function calls if that is your taste. See help string functions.

Regular expressions should also work.
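
As a variation on the second loop, here is a sketch that assumes Stata 14 or later, where strrpos() returns the position of the last occurrence of a substring. It avoids the reverse() calls. The name my_var_10 is hypothetical, added to show a word with more than one underscore, and the guard for words without any underscore is also an addition.

```stata
// my_var_10 is a hypothetical name with two underscores
local list_x rep78_3 make_1 price_1 mpg_2 my_var_10

foreach elem of local list_x {
    // position of the last underscore, or 0 if there is none
    local pos = strrpos("`elem'", "_")
    if `pos' > 0 {
        local clean = substr("`elem'", 1, `pos' - 1)
    }
    else {
        local clean `elem'
    }
    di "`clean'"
}
```

Like the loops above, this cuts at the last underscore without checking that only digits follow it.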

Roberto Ferrer
0

Using regular expressions, a solution is:

local list_x `" "rep78_3" "make_1" "price_1" "mpg_2" "'
local n_x : list sizeof list_x

forval j = 1/`n_x' {
    local varname: word `j' of `list_x'
    // "+" lets the pattern match suffixes with more than one digit, e.g. _12
    local clean_name = regexr("`varname'" , "_[0-9]+$" , "")
    di "`clean_name'" 
}
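
Since the question mentions regexm() and regexs(), the same idea can be sketched with that pair: regexm() tests the pattern, and regexs(1) returns the text captured by the first parenthesised group. This is only an illustration. It uses an unquoted list for brevity, and the fallback for words without a numeric suffix is an assumption about the desired behaviour.

```stata
local list_x rep78_3 make_1 price_1 mpg_2

foreach elem of local list_x {
    // group 1 captures everything before a trailing underscore-plus-digits suffix
    if regexm("`elem'", "^(.*)_[0-9]+$") {
        local clean = regexs(1)
    }
    else {
        // no numeric suffix: keep the word unchanged
        local clean `elem'
    }
    di "`clean'"
}
```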
Stefano Lombardi
0

You can do the same thing by combining the subinstr() function and the confirm command:

local list_x rep78_3 make_1 price_1 mpg_2

local new_list_x = subinstr("`list_x'", "_", " ", .)

foreach x of local new_list_x {
    capture confirm number `x'
    if _rc != 0 {
        local final_list_x `final_list_x' `x'
    }
}

display "`final_list_x'"
rep78 make price mpg