Read_excel having trouble with filepaths

Question

I have a list of file paths saved in a character called pwb_files_e1. Some of the pathways have apostrophes in the string. I have a fairly long for-loop that goes through the various files, saves selected information from the relevant rows in the files to a temporary data frame, and then adds that temporary data frame to compsum_df (has the same column names as listed in colnames() in the for-loop):

for (i in 1:length(pwb_files_e1)) {
  if (file.info(pwb_files_e1[i])$size < 3000000) {
    next 
  } else {
    print(i)
    count_temp_df <- match(TRUE, is.na(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = "G6:G21"))) - 1
    temp_df_main <- data.frame(matrix(ncol = 15, nrow = 0))
    colnames(temp_df_main) <- c("Sales Director",
                              "Opportunity Name",
                              "File Path",
                              "State",
                              "Producer",
                              "Effective Date",
                              "Current Funding",
                              "Single Enr",
                              "Multi Enr",
                              "In-Force Max",
                              "In-Force Expected",
                              "Quoted ASO",
                              "% Better Spec Px",
                              "% Better at Max",
                              "% Better at Expected")
    for (j in 1:count_temp_df) {
      temp_df <- data.frame(matrix(ncol = 15, nrow = 1))
      colnames(temp_df) <- c("Sales Director",
                                  "Opportunity Name",
                                  "File Path",
                                  "State",
                                  "Producer",
                                  "Effective Date",
                                  "Current Funding",
                                  "Single Enr",
                                  "Multi Enr",
                                  "In-Force Max",
                                  "In-Force Expected",
                                  "Quoted ASO",
                                  "% Better Spec Px",
                                  "% Better at Max",
                                  "% Better at Expected")
      temp_df$'Sales Director' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("BE", j + 6), col_names = FALSE)[1, 1])
      temp_df$'Opportunity Name' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("H", j + 6), col_names = FALSE)[1, 1])
      temp_df$'File Path' <- pwb_files_e1[i]
      temp_df$'State' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("BP", j + 6), col_names = FALSE)[1, 1])
      temp_df$'Producer' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("BF", j + 6), col_names = FALSE)[1, 1])
      temp_df$'Effective Date' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("M", j + 6), col_names = FALSE)[1, 1])
      temp_df$'Current Funding' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "RateDev", range = paste0("D", j + 6), col_names = FALSE)[1, 1])
      temp_df$'Single Enr' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("J", j + 6), col_names = FALSE)[1, 1])
      temp_df$'Multi Enr' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("K", j + 6), col_names = FALSE)[1, 1])
      temp_df$'In-Force Max' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "RateDev", range = paste0("V", j + 6), col_names = FALSE)[1, 1])
      temp_df$'In-Force Expected' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "RateDev", range = paste0("W", j + 6), col_names = FALSE)[1, 1])
      temp_df$'Quoted ASO' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("BH", j + 6), col_names = FALSE)[1, 1])
      if (temp_df$'Current Funding' == "sf") {
          tempBspec <-
          temp_df$'% Better Spec Px' <- as.numeric(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("P", j + 6), col_names = FALSE)[1, 1]) /
                                        as.numeric(read_excel(path = pwb_files_e1[i], 
                                                              sheet = toupper(as.character(read_excel(path = pwb_files_e1[i], sheet = "RateDev", range = paste0("B", j + 6), col_names = FALSE)[1, 1])),
                                                              range = paste0("H21"), col_names = FALSE)[1, 1]) - 1
      } else {
        "NA"
      }
      temp_df$'% Better at Max' <- as.numeric(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("AZ", j + 6), col_names = FALSE)[1, 1])
      temp_df$'% Better at Expected' <- as.numeric(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("BA", j + 6), col_names = FALSE)[1, 1])
      temp_df_main <- rbindlist(list(temp_df_main, temp_df))
    }
    compsum_df <- rbindlist(list(compsum_df, temp_df_main))
  }
}

I believe the code is getting tripped up at this point:

temp_df$'Sales Director' <- as.character(read_excel(path = pwb_files_e1[i], sheet = "Rates&Fees", range = paste0("BE", j + 6), col_names = FALSE)[1, 1])

When my code tries to read a file path with apostrophes, it outputs something like this:

Error: Evaluation error: zip file 'T:/Programs/Target/Prospects/mikeâ€™s shop, inc/202201/mikeâ€™s shop, inc Pricing WB_V1.xlsm' cannot be opened.

There should be an apostrophe where "â€™" shows up. I've seen in other threads that the "/" is causing the issue, so I've tried replacing the "/" with "//":

pwb_files_e2 <- gsub("/", "//", pwb_files_e1)

It still outputs the same "Error: Evaluation error:" as above. However, when I directly type "T:/Programs/Target/Prospects/mike’s shop, inc/202201/mike’s shop, inc Pricing WB_V1.xlsm" into the path section of read_excel, the code is able to output the desired value. Maybe I need to store the file paths in a different manner than the character method I'm using? I've also tried Encoding, str_replace, etc. to get around the file path issues, but I'm all out of ideas.

Thanks!

i dont ever use special characters nor spaces in directory and file names. It makes things difficult later on down the road. — stefan_aus_hannover, May 31 '22 at 16:56
@stefan_aus_hannover yeah, i try to avoid using those characters as much as i can, but i don't have control over the names of the file paths. too many people work out of this directory, so they can use whatever characters they like in the pathways, so long as those characters are allowed under the file explorer's directory rules. — SucculentOtter, May 31 '22 at 17:01
Could you print the character paths in `pwb_files_e1` that contain the apostrophes? Maybe there are hints in the printed characters. — Abdur Rohman, May 31 '22 at 17:17
@SucculentOtter I just found that if you use the glob function, it will accept the special characters in the directory name. — stefan_aus_hannover, May 31 '22 at 17:18
https://stackoverflow.com/questions/47132804/how-to-make-r-read-files-from-directories-containing-non-english-characters — stefan_aus_hannover, May 31 '22 at 17:19
@AbdurRohman that was a great call! when i went back and looked at the problem file path again, i saw that it didn't use a normal "'", but rather "’". if you can see, the normal apostrophe is perfectly vertical, whereas the problem apostrophe has a slight tilt to it! thankfully, it looks like only one file in pwb_files_e1 has the tilted apostrophe (out of 1300 files). i think i'm just going to remove the file path with the tilted apostrophe from pwb_files_e1. kinda lazy, but i think accounting for 99.92% of the files is a good enough ratio haha. thanks for the suggestion! — SucculentOtter, May 31 '22 at 17:37

Read_excel having trouble with filepaths

0 Answers0