
Do you know of any library that can extract the text from a type A PDF (PDF/A) so that I can read it in PHP?

I have tried many libraries, but none of them has been able to read the content. I need help.

  • This is an example file: https://drive.google.com/open?id=10MlDAC8HHEFC9zK9byb56q4wbbERfb9Z – Giuseppe De Palma Aug 05 '19 at 19:41
  • Please refine your requirements: PDF text output can be in any direction, even changing directions. Maybe compare your requirement with what you get when displaying such a file, marking and copy-pasting all the text into some window. – U. Windl Aug 05 '19 at 20:34

1 Answer


You could try PDF Parser, an open-source library available on GitHub.

The code would look something like this, but check the documentation for further details:

<?php

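// Include the Composer autoloader if not already done
// (assumes the library was installed with Composer).
require 'vendor/autoload.php';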

// Parse pdf file and build necessary objects.
$parser = new \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('document.pdf');

$text = $pdf->getText();
echo $text;

?>
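If the library fails on a particular file, a slightly more defensive version of the snippet above can make the failure easier to spot. The following is only a sketch: `extractPdfText` is a hypothetical helper name, not part of PDF Parser, and it assumes the library was installed with Composer and that the autoloader has already been included. It reads the text page by page with `getPages()` and returns `null` instead of raising an error when the file is missing or cannot be parsed.

```php
<?php
// Hypothetical helper (not part of PDF Parser): returns the extracted text,
// or null when the file cannot be read or parsed.
function extractPdfText(string $path): ?string
{
    // Fail early if the path is not a readable regular file.
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }

    // Assumption: smalot/pdfparser is installed and autoloadable.
    if (!class_exists(\Smalot\PdfParser\Parser::class)) {
        return null;
    }

    try {
        $parser = new \Smalot\PdfParser\Parser();
        $pdf    = $parser->parseFile($path);

        // Collect the text one page at a time, so that an empty page
        // is easy to notice while debugging.
        $chunks = [];
        foreach ($pdf->getPages() as $page) {
            $chunks[] = $page->getText();
        }

        return implode("\n", $chunks);
    } catch (\Exception $e) {
        // The parser throws on files it cannot handle.
        return null;
    }
}
```

A call such as `$text = extractPdfText('document.pdf');` followed by a check for `null` tells you whether extraction succeeded at all. An empty string suggests that the file was parsed but contains no extractable text, which can happen with scanned documents.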
carpinchosaurio
  • Try it with this file: https://drive.google.com/open?id=1XlS-8Mcxg9B1TN5FfubqjfYm3Uv1uOxq – Giuseppe De Palma Aug 06 '19 at 21:17
  • The code above is just an example; you should read the documentation (the first link I mentioned) to write code that matches your needs exactly. However, I also tested the PDF you mentioned, and it works using the first example on the documentation page. It prints all the text that could be read, up to the line that says "causing maggiore rifrazione.", which is the last part of the document. – carpinchosaurio Aug 07 '19 at 00:17