1

I have hundreds of excel files with a single column and a single sheet containing text. I am trying to write a loop that will 'Wrap Text' and align the single column in all of the files, preferably without reading the files into R.

I already set the style object as follows:

  style <- 
    openxlsx::createStyle(
    halign = "left",
    valign = "center",
    wrapText = T
   )

I have tried both a for loop and lapply but both only performs the openxlsx::addStyle to one file out of the 100s. Doesn't have to be openxlsx, it can be xlConnect or or any other package for xlsx files, even VBA is welcomed, if I can call it from R.

Please help.

Thanks in advance.

  • what is the addStyle loop command that you use? Can you supply a reproducible example of the issue? – Jonni Feb 04 '22 at 04:58

1 Answers1

1

This will probably be pretty slow and will most likely require reading the files into R, so I'm not sure how much this helps .

Libraries

 library(openxlsx)

Find files

First you need a list of all the excel files you have:

xlsx_paths <- list.files(path = "./folder_with_yr_excels", pattern = "xlsx$")

This will create a vector of all the .xlsx files you have in the folder.

Write function

Then we can write a function to do what you want to a single file:

 text_wrapper <- function(xlsx_path){


     #this links the file to R using the openxlsx package
      n3 <- openxlsx::loadWorkbook(file = xlsx_path)

     # this creates the style that you wanted:
         style <- 
         openxlsx::createStyle(
         halign = "left",
         valign = "center",
          wrapText = TRUE
       )


   # this adds the style to the excel file we just linked with R

    openxlsx::addStyle(n3, sheet = 1, cols = 1:400, rows= 1:400, style, gridExpand = TRUE)


   #this removes the .xlsx part from the path name
   xlsx_path2 <- sub(pattern = ".xlsx", 
                    replacement = "",
                    x= xlsx_path)

    # This is the naming standard I'll use: 
  #"original_file_name" -> "original_file_name_reformatted.xlsx"

    new_path <- paste(xlsx_path2, "_reformatted", ".xlsx", sep = "")

    # this saves the reformated excel file 
   saveWorkbook(n3, file = new_path, overwrite = TRUE)



 }

Notes

For other people coming across this post, here's a more in depth description of the openxlsx R package and some of the formatting things that can be done with it: https://cran.r-project.org/web/packages/openxlsx/openxlsx.pdf

An annoying thing about this package is that you have to specify how many rows and columns you want to apply the style to, which become annoying when you don't know how many rows and columns you have. The not great workaround is to specify a large number of columns (in this case I did 400)

  openxlsx::addStyle(n3, sheet = 1, cols = 1:400, rows= 1:400, style, gridExpand = TRUE)

As of the time of posting, it sounds like there's not a better solution: https://github.com/awalker89/openxlsx/issues/439

Apply function to files

Anyways, the final step is to apply the function we wrote to all the excel files we found.

 lapply(paste("./folder_with_yr_excels",xlsx_paths,sep = ""), text_wrapper) 

Since that was done inside of a function we don't have to go back and delete intermediate data file. Yay!

Notes

The paste("./folder_with_yr_excels",xlsx_paths,sep = "") step adds the folder name back to the path name. There's an option in list.files() to keep the whole file path intact, but I like to keep track of which folder I'm dealing with by pasting the folder name back on at the end.

Russ
  • 1,385
  • 5
  • 17