I have a set of 18 Stata data files (one per year) whose names are:

{aindresp.dta, bindresp.dta, ... , rindresp.dta}

I want to eliminate some variables from each dataset. To do so, I want to use the fact that many variables have the same name across datasets, apart from a prefix given by the dataset letter (a, b, c, ..., r). For example, the variable rach12 is called arach12 in dataset aindresp.dta, and so on. Thus, to clean each dataset, I run a loop like the following:

clear all
local list a b c d e f g h i j k l m n o p q r
foreach var of local list {
use `var'indresp.dta
drop `var'rach12 `var'jbchc1 `var'jbchc2 `var'jbchc3 `var'xpchcf `var'xpchc
save `var'indresp.dta, replace
}

The actual loop is much larger. I am deleting around 200 variables.

The problem is that some variables change name over time or disappear after a few years, while other variables are added. The loop therefore stops as soon as a variable is not found, because the drop command in Stata exits with an error, and it has no option to force it to continue.

How can I achieve my goal? I would rather not go through each dataset manually.

luchonacho

1 Answer

help capture

You can just put capture in front of the drop and keep going, but it would be a little better to flag which datasets fail.

In this sample code, I've presumed that there is no point to the save, replace if you didn't drop anything: when drop cannot find one of the variables named, it fails without dropping any of them. The basic idea is that a failure of a command results in a non-zero error code accessible in _rc. This will be positive (true) if there was a failure and zero (false) otherwise.

A more elaborate procedure would be to loop over the variables concerned and flag specific variables not found.

clear all
local list a b c d e f g h i j k l m n o p q r
foreach var of local list {
    use `var'indresp.dta
    capture drop `var'rach12 `var'jbchc1 `var'jbchc2 `var'jbchc3 `var'xpchcf `var'xpchc
    if _rc {
        noisily di "Note: failure for `var'indresp.dta"
    } 
    else save `var'indresp.dta, replace
}
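
The more elaborate procedure mentioned above, checking each variable separately, might look like the following minimal sketch. It uses the auto dataset shipped with Stata so that it runs on its own; frog and toad are made-up names that do not exist there, and the local macros wanted and missing are purely illustrative names. The exact option of confirm variable is used here so that an abbreviation of some other variable's name does not count as a match.

```stata
* Sketch only: drop each listed variable that exists, and report the rest
sysuse auto, clear

* mpg and weight exist in auto; frog and toad are hypothetical names
local wanted mpg weight frog toad
local missing

foreach v of local wanted {
    * confirm returns a non-zero _rc if `v' is not a variable
    capture confirm variable `v', exact
    if _rc == 0 {
        drop `v'
    }
    else {
        * remember the names that were not found
        local missing `missing' `v'
    }
}

if "`missing'" != "" {
    display "Note: variables not found: `missing'"
}
```

In the loop over datasets, the same pattern would sit between the use and the save, with wanted holding the prefixed variable names for that dataset.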

See also Does Stata have any `try and catch` mechanism similar to Java?

EDIT:

If you want to drop whatever exists, then this should suffice for your problem.

clear all
local list a b c d e f g h i j k l m n o p q r
foreach var of local list {

    use `var'indresp.dta
    capture drop `var'rach12 `var'jbchc1 `var'jbchc2 `var'jbchc3 `var'xpchcf `var'xpchc
    if _rc {
        di "Note: problem for `var'indresp.dta"
        checkdrop `var'rach12 `var'jbchc1 `var'jbchc2 `var'jbchc3 `var'xpchcf `var'xpchc
    } 

    save `var'indresp.dta, replace
}

where checkdrop is something like

*! 1.0.0 NJC 1 April 2016 
program checkdrop
    version 8.2 

    foreach v of local 0 { 
        capture confirm var `v' 
        if _rc == 0 { 
            local droplist `droplist' `v'  
        } 
        else local badlist `badlist' `v'  
    } 

    if "`badlist'" != "" {
        di _n "{p}{txt}variables not found: {res}`badlist'{p_end}" 
    } 

    if "`droplist'" != "" { 
        drop `droplist' 
    } 
end 
Nick Cox
  • Mmm, using `capture` continues the iteration if a command fails (as in the title of the post). As in my previous title, my goal is for the `drop` command to continue dropping even if one variable is not found. Alternatively, I have to write a drop for each single variable, as you suggest. Is there not a user-written command that drops all found variables within a command line? – luchonacho Apr 01 '16 at 10:25
  • I didn't see in your question exactly what you wanted to do if there was a problem. The title is just "continue loop". But if the answer is "`drop` whatever exists" then that is an easy program. I'll edit above. – Nick Cox Apr 01 '16 at 10:41
  • That's great! I'm not very knowledgeable on Stata codes, but is your program deleting variables **temporarily** within each loop? – luchonacho Apr 01 '16 at 11:33
  • Not at all. There is no such thing in Stata anyway, other than reading the dataset back in again. Do you want to `drop` the variables or not? There is no middle way. – Nick Cox Apr 01 '16 at 11:47
  • What I mean is whether the code above will check the same original subset of variables to delete in every iteration. Say that variables to delete are `a c d e f g`. If `b` is not in the first iteration dataset, will the second iteration also look for `b`? That is what I mean by a temporary subset within every loop. Perhaps I'm not understanding your code. Is it not looking for all those variables within the drop command that are present in the respective iteration dataset and saving them as `droplist`? – luchonacho Apr 01 '16 at 15:16
  • Yes and no. `droplist` is not saved; it just contains the names of the variables that are dropped. `checkdrop` doesn't remember what it did last time. To understand `checkdrop` follow `sysuse auto, clear` with `checkdrop mpg weight frog toad` after installing it as `checkdrop.ado` in what `adopath` calls `PLUS`. – Nick Cox Apr 01 '16 at 15:52