
I'd like to use PyMuPDF to split a PDF into separate files, one per bookmark, where each file is named after its bookmark and contains only that bookmark's page.

I've successfully generated my files, for example 4 PDF files from a 4-page source PDF, but each output PDF contains a seemingly random number of pages instead of a single page.

import sys, fitz

file = '/home/ilyes/Bulletins_Originaux.pdf'
bookmark = ''
try:
    doc = fitz.open(file) 
    toc = doc.getToC(simple = True)
    
except Exception as e:
    print(e)


for i in range(len(toc)):
    
    documentPdfCible=toc[i][1]
    documentPdfCibleSansSlash=documentPdfCible.replace("/","-")
    
    numeroPage=toc[i][2]
    
    
    pagedebut=numeroPage
    pagefin=numeroPage + 1
    
    print (pagedebut)
    print (pagefin)
     
    doc2 = fitz.open(file)
    
    doc2.insertPDF(doc, from_page = pagedebut, to_page = pagefin, start_at = 0)
    
    doc2.save('/home/ilyes/' + documentPdfCibleSansSlash + ".pdf")
    doc2.close
    
   
    

Could you tell me what's wrong? Maybe it's because I always reuse "doc2" in the loop?

Thank you,

Abou Ilyès

1 Answer


It seems odd that you open the same document twice.

You open your PDF file at doc = fitz.open(file) and again at doc2 = fitz.open(file).

Then you insert pages into that second copy with doc2.insertPDF(doc, from_page = pagedebut, to_page = pagefin, start_at = 0).

Since doc2 already contains every page of the source, each saved file ends up with all the original pages plus the "randomly" inserted ones.
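A related detail, offered as my reading of the PyMuPDF documentation rather than something taken from the question: the page number stored in each TOC entry is 1-based, while from_page and to_page are 0-based and inclusive. If that holds, from_page = numeroPage with to_page = numeroPage + 1 copies two pages and starts one page too late. A minimal sketch of the arithmetic, using a hypothetical helper name and made-up bookmark titles:

```python
def single_page_range(toc_page):
    """Map a 1-based TOC page number to a 0-based, inclusive one-page range."""
    index = toc_page - 1
    return index, index


# Hypothetical TOC entries in the [level, title, page] shape returned by getToC
toc = [[1, "Bulletin A", 1], [1, "Bulletin B", 2]]

for level, title, page in toc:
    first, last = single_page_range(page)
    # first == last, so exactly one page would be copied
    print(title, first, last)
```

Passing the same value for from_page and to_page is what limits the copy to a single page.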

I recommend replacing doc2 = fitz.open(file) with doc2 = fitz.open()

This creates an empty in-memory PDF (see the documentation), into which you can then insert the pages you need from doc. Then save it as a new PDF named after its bookmark title by running

doc2.save('/home/ilyes/' + documentPdfCibleSansSlash + ".pdf")
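Putting the pieces together, here is a sketch of how the whole loop might look. It is not a tested drop-in replacement: safe_name and split_by_bookmarks are hypothetical names, and the code uses get_toc and insert_pdf, which are the newer spellings of the getToC and insertPDF calls in the question (use whichever your installed PyMuPDF version provides). It also assumes TOC page numbers are 1-based and from_page/to_page are 0-based and inclusive.

```python
import os


def safe_name(title):
    """Replace path separators so a bookmark title can be used as a file name."""
    return title.replace("/", "-").strip()


def split_by_bookmarks(source_path, output_dir):
    """Write one single-page PDF per bookmark and return the paths written."""
    import fitz  # PyMuPDF; imported here so safe_name works without it

    written = []
    src = fitz.open(source_path)
    try:
        for _level, title, page in src.get_toc(simple=True):
            if page < 1:
                continue  # skip bookmarks that do not point at a page
            out = fitz.open()  # new, empty in-memory PDF
            # Assumption: TOC pages are 1-based, page indices are 0-based
            out.insert_pdf(src, from_page=page - 1, to_page=page - 1)
            target = os.path.join(output_dir, safe_name(title) + ".pdf")
            out.save(target)
            out.close()  # with parentheses; a bare `doc2.close` does nothing
            written.append(target)
    finally:
        src.close()
    return written
```

With the paths from the question, the call would be split_by_bookmarks('/home/ilyes/Bulletins_Originaux.pdf', '/home/ilyes'). Note that two bookmarks with the same title would overwrite each other's file in this sketch.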

Red