
I am trying to use PyPDF2 to merge two PDF pages into one.

Here are the example PDF files: http://ge.tt/9IvaIo01

But when I merge them, I get a copy of the same page at the top and at the bottom. The sample below demonstrates this: after calling mergeTranslatedPage on page 0 with page 1, the output contains two copies of page 0 and none of page 1.

Maybe it is my mistake or a misunderstanding. Thank you.

from PyPDF2 import PdfFileReader, PdfFileWriter
import os

api = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'api')
# Keep the input file open until the output has been written.
with open(os.path.join(api, 'example_doc_in.pdf'), 'rb') as inputStream:
    reader = PdfFileReader(inputStream)
    output = PdfFileWriter()
    # Some logic with merging pages: draw page 1 onto page 0, shifted by ty=-384
    page = reader.getPage(0)
    page.mergeTranslatedPage(page2=reader.getPage(1), tx=0, ty=-384)
    output.addPage(page)
    with open(os.path.join(api, 'example_doc_out.pdf'), 'wb') as outputStream:
        output.write(outputStream)
Martin Thoma
Darius

1 Answer


Thanks to https://www.freelancer.com/u/ils7.html for spending the time to find the bug. The solution is to replace the _mergeResources function in pdf.py with:

    def _mergeResources(res1, res2, resource):
        newRes = DictionaryObject()
        newRes.update(res1.get(resource, DictionaryObject()).getObject())
        page2Res = res2.get(resource, DictionaryObject()).getObject()
        renameRes = {}
        for key in page2Res.keys():
            # Changed condition: this comparison was `!=` before the fix.
            if key in newRes and newRes[key] == page2Res[key]:
                # Name collision: keep page 2's resource under a new name.
                newname = NameObject(key + "renamed")
                renameRes[key] = newname
                newRes[newname] = page2Res[key]
            elif key not in newRes:
                newRes[key] = page2Res.raw_get(key)
        return newRes, renameRes
    _mergeResources = staticmethod(_mergeResources)

The error was in the condition of the `if` statement. The fixed code uses

`newRes[key] == page2Res[key]:`

where the original had

`newRes[key] != page2Res[key]:`
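To see why the rename branch matters for the symptom in the question, here is a minimal sketch that uses plain dicts instead of PyPDF2 objects. Everything in it is an assumption made for illustration: the resource name `/Fm0`, the string values, and the `should_rename` hook, which stands in for the comparison in the `if` condition. It does not claim to show what the example PDF actually contains.

```python
def merge_resources(res1, res2, should_rename):
    """Merge two name -> resource dicts with the same branch structure
    as _mergeResources. `should_rename` is a hypothetical hook that
    replaces the comparison in the if-condition."""
    new_res = dict(res1)
    rename_map = {}
    for key, value in res2.items():
        if key in new_res and should_rename(new_res[key], value):
            # Collision handled: page 2's resource is kept under a new name.
            new_name = key + "renamed"
            rename_map[key] = new_name
            new_res[new_name] = value
        elif key not in new_res:
            new_res[key] = value
        # Otherwise the name collides and nothing is renamed: page 2's
        # entry is dropped and its content resolves to page 1's resource.
    return new_res, rename_map


# Two pages that use the same (assumed) resource name for different content.
page0 = {"/Fm0": "form drawn on page 0"}
page1 = {"/Fm0": "form drawn on page 1"}

# Collision without a rename: page 1's content ends up pointing at page 0's form.
merged, renames = merge_resources(page0, page1, lambda a, b: False)
no_rename_lookup = merged[renames.get("/Fm0", "/Fm0")]

# Collision with a rename: page 1's content is redirected to its own form.
merged, renames = merge_resources(page0, page1, lambda a, b: True)
rename_lookup = merged[renames.get("/Fm0", "/Fm0")]

print(no_rename_lookup)  # form drawn on page 0
print(rename_lookup)     # form drawn on page 1
```

When the rename branch is not taken for a colliding name, the merged page draws page 0's resource twice, which matches the two copies of page 0 described in the question.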

Thanks again to ils

Darius