I have a WordPress site installed on a VPS running Debian 11. One of its features is reading uploaded PDF documents using the Xpdf tools and the PHP wrapper PHP-XPDF (https://github.com/alchemy-fr/PHP-XPDF), which relies on XpdfReader (https://www.xpdfreader.com/index.html).
Basically I want to write the contents of the PDF to a string and then write that to an ACF custom field.
But I have a problem with the path to the PDF file. I tried both the URL (https://silkstack.com/wp-content/uploads/2023/06/document.pdf) and the file path (/var/www/html/wp-content/uploads/2023/06/document.pdf); in both cases I get an 'is not a valid file' error.
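For reference, this is roughly how the URL could be mapped to a filesystem path before handing it to PHP-XPDF. It is only a minimal sketch: `upload_url_to_path()` is a hypothetical helper, not part of WordPress or PHP-XPDF, and I am assuming the base URL and base directory would come from `wp_upload_dir()['baseurl']` and `wp_upload_dir()['basedir']`.

```php
<?php
// Hypothetical helper: map an uploads URL to the matching filesystem path.
// Assumption: in WordPress, $baseUrl and $baseDir would be taken from
// wp_upload_dir()['baseurl'] and wp_upload_dir()['basedir'].
function upload_url_to_path(string $url, string $baseUrl, string $baseDir): ?string
{
    $baseUrl = rtrim($baseUrl, '/') . '/';
    if (strncmp($url, $baseUrl, strlen($baseUrl)) !== 0) {
        return null; // the URL does not point into the uploads directory
    }

    // Decode %20 and similar escapes, since names on disk are not URL-encoded
    $relative = rawurldecode(substr($url, strlen($baseUrl)));

    return rtrim($baseDir, '/') . '/' . $relative;
}

// Values taken from this question
echo upload_url_to_path(
    'https://silkstack.com/wp-content/uploads/2023/06/document.pdf',
    'https://silkstack.com/wp-content/uploads',
    '/var/www/html/wp-content/uploads'
), PHP_EOL;
```

The sketch ignores query strings and does not guard against `..` segments, so it is only meant for URLs that WordPress itself generated.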
Xpdf/pdftotext itself works normally on the server. If I run the command directly in the shell ("pdftotext /var/www/html/wp-content/uploads/2023/06/document.pdf"), a txt file with the content of the PDF is saved in the same location.
I also tested with a simple PHP script and a PDF document in the same folder (so only the file name, without a path, is passed in PHP), and in that case the script works. Example:
<?php
require __DIR__ . '/vendor/autoload.php';
$logger = null;
$pdfToText = XPDF\PdfToText::create(array(
    'pdftotext.binaries' => '/usr/bin/xpdf',
    'pdftotext.timeout'  => 30, // timeout for the underlying process
), $logger);
$text = $pdfToText->getText('sample.pdf');
// remove non-latin characters
$clean_txt = preg_replace('/[^\00-\255]+/u', '', $text);
var_dump($clean_txt);
Any idea how I would set the file path for PHP-XPDF?
Update 17.6.2023:
I'm aware that using a URL makes no sense here, so I now use the file path /var/www/... and get this error:
PHP Fatal error: Uncaught Alchemy\BinaryDriver\Exception\ExecutionFailureException: pdftotext failed to execute command '/usr/bin/xpdf' '-raw' '-nopgbrk' '-enc' 'UTF-8' '-eol' '-unix' '/var/www/html/wp-content/uploads/2023/06/document.pdf' '/tmp/xpdfWfGd3O' in /var/www/html/wp-content/themes/child_theme/vendor/alchemy/binary-driver/src/Alchemy/BinaryDriver/ProcessRunner.php:100
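The exception only says that the command failed, not why. To see what the binary itself prints, the same argument list could be run directly from PHP. This is a minimal sketch, not something PHP-XPDF provides: `run_command()` is a hypothetical helper, and it assumes PHP 7.4 or newer, where `proc_open()` accepts an argument array and the `redirect` descriptor.

```php
<?php
// Hypothetical debugging helper: run a command without a shell and
// return its exit code together with everything it printed.
function run_command(array $argv): array
{
    $descriptors = [
        1 => ['pipe', 'w'],   // capture stdout
        2 => ['redirect', 1], // send stderr to the same pipe (PHP 7.4+)
    ];

    // An array command (PHP 7.4+) is executed directly, so no shell escaping is needed
    $process = proc_open($argv, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start: ' . $argv[0]);
    }

    // Read until the process closes its output, then collect the exit code
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);

    return ['exit_code' => proc_close($process), 'output' => $output];
}
```

Calling it with the arguments from the error message, starting with '/usr/bin/xpdf', from the same WordPress context should show the binary's own message and exit code instead of the generic exception.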
Could it be a permissions problem? For www-data, the folders have permissions 0755 and the files 0644.
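To rule permissions in or out, a check along these lines could be run from inside WordPress, so that it executes as the same user as the failing code. It is only a sketch: `describe_pdf_path()` is a hypothetical helper built on standard PHP functions, and the user lookup assumes the posix extension is available (it falls back to `get_current_user()` otherwise).

```php
<?php
// Hypothetical helper: report how the PHP process itself sees a file.
function describe_pdf_path(string $path): array
{
    clearstatcache(true, $path); // avoid stale stat results
    $exists = file_exists($path);

    return [
        'path'     => $path,
        'realpath' => realpath($path), // false if the path cannot be resolved
        'is_file'  => is_file($path),
        'readable' => is_readable($path),
        // last four octal digits of the mode, e.g. "0644"
        'perms'    => $exists ? substr(sprintf('%o', fileperms($path)), -4) : null,
        // posix_* functions need the posix extension, so guard the call
        'php_user' => function_exists('posix_geteuid')
            ? (posix_getpwuid(posix_geteuid())['name'] ?? null)
            : get_current_user(),
    ];
}

// Path taken from the error message above
var_dump(describe_pdf_path('/var/www/html/wp-content/uploads/2023/06/document.pdf'));
```

If `is_file` or `readable` comes back false here while the shell command works, the difference would be in how the PHP process sees the path rather than in pdftotext itself.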