I have a few hundred PDFs that I need to convert to text. I do not need to save the text files; instead, I want to extract certain sentences from the text. I have managed to do this for a single PDF file using pdftools.
Now I need to do the same for all of my PDFs. I tried the following, but it didn't work properly.
files <- list.files(path = "my path", pattern = ".pdf", full.names = TRUE)
pdf2text <- function(x){
  x <- pdftools::pdf_text() %>%
    sapply(files, x) %>%
    return()
}
Could anyone help me, please? Thank you.
*Ideally, the texts would be stored in a data frame, with one entry per file name.
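
For reference, here is a minimal sketch of what I am aiming for. It is not a confirmed solution. It assumes that pdftools::pdf_text() takes a single file path and returns one string per page, so the function is applied to each path in turn and the pages are pasted together. The names pdfs_to_df and reader are hypothetical, made up for this illustration; reader only exists so that the reading step can be swapped out.

```r
# Sketch only: `pdfs_to_df` and its `reader` argument are hypothetical names.
# `reader` defaults to pdftools::pdf_text, which returns one string per page.
pdfs_to_df <- function(files, reader = pdftools::pdf_text) {
  # Read each file and join its pages into a single string.
  texts <- vapply(
    files,
    function(f) paste(reader(f), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )
  # One row per PDF, keyed by file name (directory part removed).
  data.frame(file = basename(files), text = texts, stringsAsFactors = FALSE)
}

# "\\.pdf$" matches only names ending in ".pdf"; in a plain ".pdf" pattern
# the "." is a regex wildcard that matches any character.
files <- list.files(path = "my path", pattern = "\\.pdf$", full.names = TRUE)
pdf_df <- pdfs_to_df(files)
```

With the text in one column, something like grepl("keyword", pdf_df$text, fixed = TRUE) should flag the files that contain a given phrase. Pulling out individual sentences would still need the text to be split first, and how to split it depends on the documents.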