Extraction of position of an image in a PDF file

Question

I am using the PyMuPDF library to extract images from a PDF file. I want to get the position (origin) and the size of each image.
I can get the sizes, but I can't get the positions correctly. This is how I extract the images:

from io import BytesIO

import fitz  # PyMuPDF
from PIL import Image as PILImage


def extract_images_from_pdf(_input_pdf_file_name, _output_folder):
    _pdf_file_document = fitz.open(_input_pdf_file_name)

    for _page_index, _page in enumerate(_pdf_file_document):
        # Get the image list for this page
        _images_list = _pdf_file_document.get_page_images(pno=_page_index, full=True)

        # Process the images of this page before moving on to the next one
        for _image_index, _image_item in enumerate(_images_list):
            _xref = _image_item[0]
            _base_image = _pdf_file_document.extract_image(_xref)
            _image_bytes = _base_image["image"]
            _pil_image = PILImage.open(BytesIO(_image_bytes))

            # Include the page number so images from different pages do not overwrite each other
            _output_image_name = f"{_output_folder}/page_{_page_index + 1:04d}_image_{_image_index + 1:04d}.png"
            _pil_image.save(_output_image_name)

I can process each image and extract it.
However, I am having trouble retrieving the original position of those images. I want to render each page as an image, get every image on that page, and then get the origin point and the size of each extracted image. I am using the following code to get the origin, but for some reason the position I get is not correct.

import fitz  # PyMuPDF


def get_image_origins(_input_pdf_file_name, _page_index):
    _pdf_file_document = fitz.open(_input_pdf_file_name)
    _page = _pdf_file_document[_page_index]
    _image_list = _pdf_file_document.get_page_images(pno=_page_index, full=True)
    _image_bounding_boxes = []

    for _image_index, _image_item in enumerate(_image_list):
        _image_code_name = _image_item[7]  # Reference name of the image on the page

        # With transform=True, each result is a (bbox, matrix) pair.
        # The bbox format is (x_min, y_min, x_max, y_max) for each image on the page.
        _image_rects = _page.get_image_rects(_image_code_name, transform=True)
        _image_box = _page.get_image_bbox(_image_item, transform=True)

        # Keep only the bbox of the first occurrence and discard the matrix
        if len(_image_rects) > 0:
            _image_bounding_box, _ = _image_rects[0]
            _image_bounding_boxes.append(_image_bounding_box)

    return _image_bounding_boxes

Please help.

What do you mean by "not getting the origin position correctly"? I assume "origin" here means the bbox covered by the image on the page. Note that with transform=True you request not only the bbox but also the transformation matrix of the image. The transformation matrix applies the scaling and rotation that make the image fit in its target bbox. In addition, get_image_bbox and get_image_rects do much the same thing, so you only need one of them. If you have more questions, I suggest visiting the "Discussions" section of the PyMuPDF homepage. - Jorj McKie, May 23 '23 at 14:29
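
To illustrate the comment above, here is a minimal sketch of reading only the bbox (no matrix) with get_image_rects and turning it into an origin and a size. It rests on an assumption the question does not confirm: that the mismatch comes from comparing bboxes, which PyMuPDF reports in PDF points (72 per inch, origin at the top-left of the page), with a page rendered at a different resolution. The names rect_to_origin_and_size and image_positions and the dpi parameter are hypothetical helpers for this illustration, not part of PyMuPDF, and the sketch does not account for page rotation or cropping.

```python
def rect_to_origin_and_size(bbox, zoom=1.0):
    """Convert (x0, y0, x1, y1) into ((x, y), (width, height)), scaled by zoom."""
    x0, y0, x1, y1 = bbox
    origin = (x0 * zoom, y0 * zoom)
    size = ((x1 - x0) * zoom, (y1 - y0) * zoom)
    return origin, size


def image_positions(pdf_path, page_index, dpi=72):
    """Return a list of (xref, origin, size) for every image shown on one page.

    Assumption: dpi is the resolution at which the page is rendered elsewhere,
    so the returned values are in pixels of that rendering.
    """
    import fitz  # PyMuPDF; imported here so the helper above works without it

    zoom = dpi / 72  # bboxes are in points, 72 points per inch
    results = []
    with fitz.open(pdf_path) as doc:
        page = doc[page_index]
        for item in page.get_images(full=True):
            xref = item[0]
            # The same image can be displayed several times on a page,
            # so get_image_rects returns a list of rectangles.
            for rect in page.get_image_rects(item):
                origin, size = rect_to_origin_and_size(tuple(rect), zoom)
                results.append((xref, origin, size))
    return results
```

For example, rect_to_origin_and_size((72, 36, 144, 108), zoom=2.0) gives ((144.0, 72.0), (144.0, 144.0)): an image placed one inch from the left edge and half an inch from the top of the page lands at pixel (144, 72) when the page is rendered at 144 dpi. If the page is rendered at 72 dpi, the bbox values can be used unchanged.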
