I would like to read scanned PDF documents into R using tesseract. In general this already works quite well, but the results are poor when a document has a table structure. After some research I found that Tesseract has a parameter for the Page Segmentation Mode (PSM). By default it expects a page of text, as in a book, so choosing a different mode should improve recognition for tables.
https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html#page-segmentation-method
Now I would like to set this PSM parameter from R, but I don't know where to find it. Most instructions and tutorials are for Python, but my project uses R. I have read that you can pass a named list to the options parameter, but I can't work out which option name and value to use.
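For reference, here is a minimal sketch of what I am trying to do. It assumes that the PSM is exposed as the Tesseract parameter tessedit_pageseg_mode and that the R tesseract package forwards it through options. The function name ocr_scanned_pdf, the file name in the usage line, and the choice of mode 6 (a single uniform block of text) are only illustrative.

```r
# Sketch only: OCR a scanned PDF with a non-default page segmentation mode.
# Assumes the 'tesseract' and 'pdftools' packages are installed.
# ocr_scanned_pdf is a hypothetical helper, not part of either package.
ocr_scanned_pdf <- function(pdf_path, psm = 6, language = "eng", dpi = 300) {
  # Build an OCR engine and pass the PSM as a named list entry.
  # Assumption: 'tessedit_pageseg_mode' is the parameter that controls the PSM.
  engine <- tesseract::tesseract(
    language = language,
    options = list(tessedit_pageseg_mode = psm)
  )

  # Render each PDF page to a PNG file (written to the working directory).
  png_files <- pdftools::pdf_convert(pdf_path, format = "png", dpi = dpi)
  on.exit(unlink(png_files), add = TRUE)  # remove the temporary images

  # Run OCR on every page image; returns one string per page.
  tesseract::ocr(png_files, engine = engine)
}

# Usage (file name is a placeholder):
# text <- ocr_scanned_pdf("scan.pdf", psm = 6)

# To look up the available parameter names:
# tesseract::tesseract_params("pageseg")
```

Which mode works best for tables probably depends on the layout, so 6 here is only a starting point to compare against other values.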
Your help would be greatly appreciated, I don't know where else to look.
Thanks in advance!