I'm looking for a better way to open all the xlsx and csv files in a directory at once and then merge them. Here is what I have done, but as you will see there are some manual steps that I'm trying to automate. Can you please help me?
1. Define the working directory

setwd("~/Desktop/repo/")
list.files()
[1] "dataset1.csv"
[2] "dataset2.csv"
[3] "dataset3.csv"
[4] "dataset4.csv"
[5] "export.xlsx"
[6] "export1.xlsx"
[7] "export2.xlsx"
2. Open all CSV files (this part seems OK to me, as each dataset opens automatically)

file.list <- list.files(pattern = "\\.csv$")  # regex, not a glob: match files ending in .csv
csv_df <- lapply(file.list, read.csv, header = TRUE)
csv_df <- do.call("rbind", csv_df)  # stack all csv files into a single data frame
3. Open all XLSX files (here comes the ugly manual part)

library(readxl)

df1 <- lapply(excel_sheets("export.xlsx"), read_excel, path = "export.xlsx")
df2 <- lapply(excel_sheets("export1.xlsx"), read_excel, path = "export1.xlsx")
df3 <- lapply(excel_sheets("export2.xlsx"), read_excel, path = "export2.xlsx")
df1 <- do.call("rbind", df1)
df2 <- do.call("rbind", df2)
df3 <- do.call("rbind", df3)
xlsx_df <- rbind(df1, df2, df3)  # merge all xlsx
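For what it's worth, I tried to sketch a loop along these lines to replace the repeated df1/df2/df3 blocks (assuming the readxl package is installed and that every sheet in every workbook has the same columns), but I'm not sure it's the idiomatic way:

```r
library(readxl)

# hypothetical helper: read every sheet of one workbook and stack them
read_workbook <- function(path) {
  sheets <- lapply(excel_sheets(path), read_excel, path = path)
  do.call("rbind", sheets)
}

# apply the helper to every xlsx file in the working directory
xlsx.list <- list.files(pattern = "\\.xlsx$")
xlsx_df <- do.call("rbind", lapply(xlsx.list, read_workbook))
```

This way adding a new export file to the directory would not require touching the script.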
4. Merge everything into a big dataset
big_df <- rbind(csv_df, xlsx_df)
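As a sanity check (rbind fails unless the column names match exactly, which I'm assuming here), I inspect the pieces before and the result after merging:

```r
# rbind requires identical column names across the two data frames
stopifnot(identical(names(csv_df), names(xlsx_df)))

str(big_df)  # inspect the structure of the merged result
```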
Any suggestions? How can I improve this script (especially the 3rd part)?