
I have a group of Excel files, each with multiple sheets that don’t follow a standard naming convention. I want to create a single data frame from only those sheets whose names contain the keyword 'frame'.

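library(tidyverse)
library(readxl)    # read_excel() and excel_sheets(); not attached by library(tidyverse)
library(openxlsx)  # write.xlsx()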

# Sample Excel File 1
df1 <- data.frame(replicate(10,sample(0:1,10,rep=TRUE)))
data_frame2 <- data.frame(replicate(10,sample(0:1,10,rep=TRUE)))
list_of_datasets1 <- list("df" = df1, "date_frame" = data_frame2)
write.xlsx(list_of_datasets1, file = "writeXLSX1.xlsx")

# Sample Excel File 2
df3 <- data.frame(replicate(10,sample(0:1,10,rep=TRUE)))
data_frame4 <- data.frame(replicate(10,sample(0:1,10,rep=TRUE)))
list_of_datasets2 <- list("date_frames" = df3, "dfs" = data_frame4)
write.xlsx(list_of_datasets2, file = "writeXLSX2.xlsx")

# Create List of Excel Files
excel_file_list <- list.files(pattern = "writeXLSX\\d*\\.xlsx$", full.names = TRUE)

I'd like to be able to do this using a regex with purrr, like this:

df_bind <- excel_file_list %>%
  map_dfr(~read_excel(.x, sheet = grepl("frame", .x)))

The closest answer I found works fine with a single file. However, I can't quite figure out how to extract the sheet names correctly when they're in a list.

rsylatian
  • Have you tried 'readxl::excel_sheets' to extract the sheet names to use directly in the map function? – Peter Apr 19 '20 at 17:34
  • Yeah, but again wasn't sure on how to exit the loop error when it requested a path string. – rsylatian Apr 19 '20 at 17:38

1 Answer


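We can use str_detect on the sheet names returned by excel_sheets, then pass the index of the matching sheet (from which) to read_excel: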

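library(readxl)
library(dplyr)
library(purrr)
library(stringr)  # provides str_detect()

# For each file, find the index of the sheet whose name contains 'frame',
# read that sheet, and row-bind the results into one data frame
excel_file_list %>% 
      map_dfr(~ read_excel(.x, sheet = which(str_detect(excel_sheets(.x), 'frame'))))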
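The attempt in the question applied grepl to the file path rather than to the sheet names, so read_excel received a logical value instead of a sheet name or index. The approach above fixes that, but read_excel accepts only one sheet per call, so a file with two or more matching sheets would raise an error. A minimal sketch of one way to handle that case follows; the helper name read_matching_sheets and the "sheet" and "file" id columns are assumptions made for illustration, not part of readxl or purrr.

```r
library(readxl)
library(dplyr)
library(purrr)
library(stringr)

# Hypothetical helper (not part of any package): read every sheet in `path`
# whose name matches `pattern` and row-bind them, recording the sheet name.
read_matching_sheets <- function(path, pattern = "frame") {
  sheets  <- excel_sheets(path)
  matched <- sheets[str_detect(sheets, pattern)]
  matched %>%
    set_names() %>%  # name each element after itself so .id can use it
    map_dfr(~ read_excel(path, sheet = .x), .id = "sheet")
}

# Same file list as in the question; empty if no such files exist.
excel_file_list <- list.files(pattern = "writeXLSX\\d*\\.xlsx$", full.names = TRUE)

# Apply the helper to every file and record which file each row came from.
df_bind <- excel_file_list %>%
  set_names() %>%
  map_dfr(read_matching_sheets, .id = "file")
```

A file with no matching sheet contributes zero rows rather than an error, which may or may not be what you want; add an explicit check on length(matched) if a missing sheet should be reported.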
akrun
  • Great answer, exactly what I'm looking for. Can you just change the `map` function to `map_dfr` and I'll accept it. – rsylatian Apr 19 '20 at 17:47