
I have a PDF document in which one page contains an image of a graph plot; however, the plot's legend is not part of the image. I am using PyMuPDF to extract this image as follows:

  import fitz  # PyMuPDF

  # doc, page_num, basePath, fund_type, filename and the log file f are assumed to be defined earlier
  for img in doc.getPageImageList(page_num, full=True):
    xref = img[0]  # cross-reference number of the embedded image
    pix = fitz.Pixmap(doc, xref)
    if pix.n - pix.alpha < 4:  # this is GRAY or RGB
      out_path = "%s/test_data/%s/%s%s-%s.png" % (basePath, fund_type, filename, page_num, xref)
      pix.writePNG(out_path)
      print(filename + ' : ' + out_path, file=f)

This gives me the image (the graph plot). I also want to capture some height below the image so that the plot legend is included in the output. Is this possible using PyMuPDF? Any code pointers would also be helpful.

CuriousBug
`Page.getText("dict")` might be an option: it returns image blocks together with their coordinates. You could then use `Image` to crop the region covering the image and its legend based on those coordinates. I am not sure whether the image blocks will include the legend as you require, but it is an option. [https://pymupdf.readthedocs.io/en/latest/faq.html] – liamsuma Oct 15 '20 at 20:43
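The idea in this comment could be sketched roughly as follows. Instead of extracting the embedded image by xref, the sketch renders the page region of each image block, extended downward, so that any legend text drawn below the plot ends up in the PNG. It assumes a recent PyMuPDF release, where the methods are spelled `get_text`, `get_pixmap` and `save` (older releases used `getText`, `getPixmap` and `writePNG`). The names `expand_below` and `save_images_with_legend`, the 60-point `extra_height`, the `zoom` factor and the output file names are all hypothetical choices for illustration, and the approach only helps if the plot is stored as an image block rather than as vector drawings.

```python
def expand_below(bbox, extra_height, page_bottom):
    """Return bbox grown downward by extra_height, clamped to the page bottom.

    PyMuPDF page coordinates have y increasing downward, so "below" means a larger y1.
    """
    x0, y0, x1, y1 = bbox
    return (x0, y0, x1, min(y1 + extra_height, page_bottom))


def save_images_with_legend(pdf_path, page_num, extra_height=60, zoom=2):
    """Render every image block on one page, plus a strip below it, to PNG files."""
    import fitz  # PyMuPDF; imported here so expand_below can be used without it

    doc = fitz.open(pdf_path)
    page = doc[page_num]
    saved = []
    blocks = page.get_text("dict")["blocks"]
    for i, block in enumerate(blocks):
        if block["type"] != 1:  # type 1 is an image block, type 0 a text block
            continue
        # extra_height is a guess; tune it to the legend height in your PDFs
        clip = fitz.Rect(expand_below(block["bbox"], extra_height, page.rect.y1))
        # Rendering the clipped page area captures the image and whatever lies below it
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom), clip=clip)
        out_path = "page%d-block%d.png" % (page_num, i)  # hypothetical naming scheme
        pix.save(out_path)
        saved.append(out_path)
    doc.close()
    return saved
```

Calling `save_images_with_legend("report.pdf", 0)` (with a hypothetical file name) would write one PNG per image block on the first page. Rendering with a clip rectangle, rather than cropping the extracted image, is what allows text outside the embedded image to be included.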

0 Answers