I'd like to parse a PDF file with PHP and the Symfony 4 framework, search it for certain keywords, and split it into multiple PDF files at the points where those keywords occur.

I did some research and found many ways to parse PDF files with PHP, but very few that both parse and split a PDF file.

Are there any websites I could consult, or libraries I could use?

This is the code I used to extract the whole text from a PDF file. My actual goal, however, is to extract specific paragraphs from the PDF. For example, I want to extract the paragraph that runs from the word 'hello' to the word 'world'.

    public function extract(): Response
    {
        $parser = new \Smalot\PdfParser\Parser();
        $CV = $parser->parseFile('C:\Users\lenovo\Desktop\papiers\CV\CVmolkahchaichi.pdf');

        // Retrieve the document metadata and all pages from the PDF file.
        $details = $CV->getDetails();
        $pages   = $CV->getPages();

        // Print each metadata property once.
        foreach ($details as $property => $value) {
            if (is_array($value)) {
                $value = implode(', ', $value);
            }
            echo $property . ' => ' . $value . "<br/>";
        }

        // Loop over each page to extract text.
        foreach ($pages as $page) {
            echo $page->getText();
        }

        return new Response();
    }
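To make the 'hello' to 'world' goal concrete, here is a minimal sketch of the string step only, assuming the text has already been extracted (for example from `$page->getText()` as above). The helper name `extractBetween` is hypothetical and not part of smalot/pdfparser; it matches case-insensitively and includes both keywords in the result.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of any library).
 * Returns the part of $text from the first occurrence of $start up to and
 * including the first occurrence of $end that follows it.
 * Matching is case-insensitive. Returns null if either keyword is missing.
 */
function extractBetween(string $text, string $start, string $end): ?string
{
    // Position of the opening keyword.
    $from = stripos($text, $start);
    if ($from === false) {
        return null;
    }

    // Position of the closing keyword, searched only after the opening one.
    $to = stripos($text, $end, $from + strlen($start));
    if ($to === false) {
        return null;
    }

    // Slice from the start of $start to the end of $end.
    return substr($text, $from, $to + strlen($end) - $from);
}
```

Called on extracted page text, this would look like `extractBetween($page->getText(), 'hello', 'world')`. Text extracted from a PDF often contains unexpected line breaks and spacing, so the keywords may need normalizing before they match.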

molyyy
  • Asking for libraries is discouraged on this site. That said, you have two parts, and you might need two tools. Part 1 is searching a PDF for a string, which is what I assume you mean by "parse a PDF file". This method should tell you what page numbers things are found on. Part 2 is extracting/deleting pages. (Duplicating a file and deleting content is the same as extracting.) See this [answer](https://stackoverflow.com/a/1882392/231316) for some options. If you actually mean "parse a PDF", that's something else entirely. – Chris Haas Mar 14 '22 at 19:45
  • Please provide enough code so others can better understand or reproduce the problem. – Community Mar 15 '22 at 02:48
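
The two-part approach from the first comment can be sketched for its first half: given the text of each page, work out which page ranges each output file should contain. This is only a sketch under assumptions. The function name `splitRangesAtKeyword` is hypothetical, the page texts are assumed to come from something like `$page->getText()`, and a new chunk is assumed to begin on every page that contains the keyword. Copying those page ranges into new PDF files needs a separate page-manipulation library and is not shown here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of any library).
 *
 * @param string[] $pageTexts Text of each page in order (index 0 = page 1).
 * @param string   $keyword   Keyword that marks the first page of a new chunk.
 * @return int[][] List of [firstPage, lastPage] pairs, 1-based and inclusive.
 */
function splitRangesAtKeyword(array $pageTexts, string $keyword): array
{
    $ranges = [];
    $start  = 1;
    $total  = count($pageTexts);

    foreach (array_values($pageTexts) as $i => $text) {
        $pageNo = $i + 1;

        // A keyword on any page after the current chunk's first page
        // closes the current chunk and opens a new one on that page.
        if ($pageNo > $start && stripos($text, $keyword) !== false) {
            $ranges[] = [$start, $pageNo - 1];
            $start    = $pageNo;
        }
    }

    // Close the final chunk, unless there were no pages at all.
    if ($total > 0) {
        $ranges[] = [$start, $total];
    }

    return $ranges;
}
```

Each resulting pair names the first and last page of one output file, which is the input the page-extraction step would need.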
