I have a folder with multiple .dta files, and I'm using the read_dta() function from the haven package to read and bind them. The problem is that some of the files have their column names in lower case and others have them in upper case.

Is there a way to read only specific columns, converting their names to lower case in every file, without reading each whole file first and then selecting the columns? The files are really large, so reading everything would take forever.

I was hoping I could do this with the .name_repair argument of read_dta(), but I really don't know how.

I'm trying something like this:

#Load haven for read_dta():
library(haven)

#Set working directory:
setwd("T:/")

#List of .dta file names to bind (assuming they sit in the working directory):
list_names <- list.files(pattern = "_sdem\\.dta$")

#Variable names to select from those files:
vars_select <- c("r_def", "c_res", "ur", "con", "n_hog", "v_sel", "n_pro_viv", "fac", "n_ren", "upm", "eda", "clase1", "clase2", "clase3", "ent", "sex", "e_con", "niv_ins", "eda7c", "tpg_p8a", "emp_ppal", "tue_ppal", "sub_o")


#Read and bind ONLY the selected variables from the list of files
dataset <- data.frame()

for (i in seq_along(list_names)) {
  temp_data <- read_dta(list_names[i], col_select = vars_select)
  dataset <- rbind(dataset, temp_data)
}

The problem is that when a file has its variable names in upper case, the lower-case names in vars_select don't match any of its columns, and the following error appears:

Error: Can't subset columns that don't exist.
x Columns `r_def`, `c_res`, `n_hog`, `v_sel`, `n_pro_viv`, etc. don't exist.

I tried to use the .name_repair argument of read_dta() to correct this with the tolower() function.

I tried something like this on a specific file whose variable names are in upper case:

example_data <- read_dta("T:/2017_2_sdem.dta", col_select = vars_select, .name_repair = tolower(names())) 

But the same error appears:

Error: Can't subset columns that don't exist.
x Columns `r_def`, `c_res`, `n_hog`, `v_sel`, `n_pro_viv`, etc. don't exist.
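
One possible workaround, sketched here under the assumption (suggested by the error above) that col_select is matched against the names exactly as they are stored in the file, before any name repair happens: read zero rows first to get each file's column names cheaply, match them to vars_select ignoring case, and only then read the matching columns. The helper name read_dta_lower and the two small temporary files are hypothetical stand-ins for the real *_sdem.dta files. As a side note, .name_repair expects a function or a string such as "unique", so the function itself (tolower) would be passed rather than tolower(names()).

```r
library(haven)

# Hypothetical helper: read only `vars` from one .dta file, ignoring case.
read_dta_lower <- function(path, vars) {
  # Reading zero rows returns just the column names, so this step is cheap.
  file_cols <- names(read_dta(path, n_max = 0))
  # Keep the names as stored in the file whose lower-case form was requested.
  keep <- file_cols[tolower(file_cols) %in% tolower(vars)]
  out <- read_dta(path, col_select = tidyselect::all_of(keep))
  names(out) <- tolower(names(out))
  # Put columns in the requested order so every file lines up when bound.
  out[intersect(tolower(vars), names(out))]
}

# Demo on two small temporary files (stand-ins for the real files).
lower_file <- tempfile(fileext = ".dta")
upper_file <- tempfile(fileext = ".dta")
write_dta(data.frame(r_def = 1:2, ent = 3:4, extra = 5:6), lower_file)
write_dta(data.frame(R_DEF = 7:8, ENT = 9:10, EXTRA = 11:12), upper_file)

# Read each file once, then bind everything in a single step.
pieces  <- lapply(c(lower_file, upper_file), read_dta_lower,
                  vars = c("r_def", "ent"))
dataset <- do.call(rbind, pieces)
```

With the real data, the lapply() call would presumably take list_names and vars_select instead of the demo inputs. Binding once at the end also avoids growing dataset inside a loop, which tends to get slow with many large files.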

Thanks so much for your help!

georgehj
