I'm running a for loop over 13K PDF files. For each file it reads the PDF, pre-processes the text, finds similarities, and writes the result to a txt file. However, after about 760 PDF files the R session aborts. What could be the reason?
- I tried to write minimal code to reproduce the error, and the minimal code shows the same issue.
- I tried to increase memory_limit(), but that is not the issue either.
- I tried deleting hidden files in the folder, such as Thumbs.db, but the same issue appears again.
- I tried splitting the 13K PDF files into 4 folders of about 3.3K files each, and I got the same error message: Error in file(file, ifelse(append, "a", "w")) : can not open the connection. In addition: There are 50 warnings() and R session aborted.
- When I run pdf_folder[759:762] on its own, it reads perfectly fine without aborting.
library(textreadr)  # assumption: read_pdf() comes from the textreadr package

folder_path <- "C: ...."
## get vector with all pdf names
pdf_folder <- list.files(folder_path)
## for loop over all pdf documents
for (s in seq_along(pdf_folder)) {
  # for(s in 1:2){
  tryCatch({
    ## choose one pdf document from vector of strings
    pdf_document_name <- pdf_folder[s]
    ## read pdf_document pdf into data.frame
    pdf <- read_pdf(paste0(folder_path, "/", pdf_document_name))
    print(s)
    rm(pdf)
    ## first end trycatch block
  }, error = function(e) {
    print(paste0("Error: PDF Document not used: ", pdf_document_name))
  }) ## end of trycatch
} ## end of for loop
# Error:
Error in file(file, ifelse(append, "a", "w")) : can not open the connection. In addition: There are 50 warnings()
The expected outcome is that all PDF documents in folder_path are read and pre-processed.
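Because the error is raised by file(), one thing worth ruling out is that connections are being opened and never closed as the loop runs. That is only an assumption, not something confirmed above. The sketch below uses base R's showConnections() to count open connections; count_open_connections is a hypothetical helper name introduced here for illustration, and the temporary file stands in for whatever the real code opens.

```r
# Hypothetical helper (not in the original code): number of connections
# R currently has open, including stdin, stdout and stderr.
count_open_connections <- function() {
  nrow(showConnections(all = TRUE))
}

before <- count_open_connections()

# Open a connection without closing it, as a leaking function would.
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "w")
after_open <- count_open_connections()

# Closing the connection releases the slot again.
close(con)
after_close <- count_open_connections()
unlink(tmp)

print(c(before = before, after_open = after_open, after_close = after_close))
```

Printing count_open_connections() next to print(s) inside the loop would show whether the count grows with every PDF. If it does, the leak may sit in the reading or writing step rather than in any particular file, which would be consistent with pdf_folder[759:762] reading fine on its own.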