Is there a way to extract images from a PDF using R and save them into a folder? There are many similar questions for other programming languages, and there is apparently a way to do this in Python: https://www.thepythoncode.com/article/extract-pdf-images-in-python. I was wondering whether the same can be done in R.

There is the pdftools package in R, but it does not sound like it can help much with images: it reads text, and there is an option for OCR. I just want to extract the images and store them in a folder.

I could use the reticulate package to call the Python method from R, but then I would not be able to loop or map over it the way I would like. That is why I am asking whether anyone knows a way to do this in R.

Thank you.

Bahi8482
  • What about [this](https://www.google.com/amp/s/rdrr.io/a/cran/metagear/src/R/PDF_extractImages.R)? – at80 Nov 13 '20 at 18:22
  • @at80 Thank you. I could not install the package yet and will try on a different computer, but it looks like it does exactly what I want, plus many additional useful functions! – Bahi8482 Nov 14 '20 at 01:57
  • With the R package pdftools, you can use pdf_convert and pdf_render_page. I use this at my job and it works very well to convert PDF pages to images. – Emmanuel Hamel Sep 15 '22 at 21:43

1 Answer

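You can try something like this. Note that pdf_convert renders each page of the PDF as an image file; it does not pull out the individual images embedded in a page:

library(pdftools)

# Path to the input PDF
path_To_PDF <- "C:/my_pdf.pdf"

# Render every page to a PNG file (the default format) in the working directory
pdf_convert(path_To_PDF)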
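
If the goal is the embedded images themselves rather than rendered pages, one possible route that stays inside R is to call the external `pdfimages` command-line tool (shipped with Poppler) through `system2`. The following is only a sketch under stated assumptions: `pdfimages` must be installed and on the PATH, the helper names `image_prefix` and `extract_embedded_images` are made up for illustration, and the folder names in the usage comments are hypothetical.

```r
# Hypothetical helper: build the output prefix handed to pdfimages,
# e.g. "out/report" for "docs/report.pdf" and out_dir = "out".
image_prefix <- function(pdf_path, out_dir) {
  file.path(out_dir, tools::file_path_sans_ext(basename(pdf_path)))
}

# Hypothetical helper: extract the images embedded in one PDF into out_dir.
# Assumes the external `pdfimages` tool (Poppler) is available on the PATH.
extract_embedded_images <- function(pdf_path, out_dir) {
  if (!nzchar(Sys.which("pdfimages"))) {
    stop("pdfimages was not found on the PATH")
  }
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  prefix <- image_prefix(pdf_path, out_dir)

  # -all asks pdfimages to write each embedded image in its native format;
  # output files are named <prefix>-<number>.<extension>.
  status <- system2("pdfimages", c("-all", shQuote(pdf_path), shQuote(prefix)))
  if (status != 0) {
    stop("pdfimages failed for ", pdf_path)
  }

  # Return the paths of the files written for this PDF.
  files <- list.files(out_dir, full.names = TRUE)
  files[startsWith(basename(files), paste0(basename(prefix), "-"))]
}

# Usage sketch (hypothetical folders), looping over every PDF in a directory:
# pdfs <- list.files("C:/my_pdfs", pattern = "\\.pdf$", full.names = TRUE)
# images <- lapply(pdfs, extract_embedded_images, out_dir = "C:/my_images")
```

Because the function takes a single path and returns a character vector of the files it wrote, it can be passed to `lapply` or `purrr::map` directly, which addresses the looping concern raised in the question.
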
Emmanuel Hamel