Example of PDF: "Smith#00$Consolidated_Performance.pdf"

The goal is to add a bookmark to page 1 of each PDF based on the filename.

(Bookmark name in example would be "Consolidated Performance")

import os
from openpyxl import load_workbook
from PyPDF2 import PdfFileMerger

cdir = "Directory of PDF" # Current directory
pdfcdir = [filename for filename in os.listdir(cdir) if filename.endswith(".pdf")]

def addbookmark(f):
    output = PdfFileMerger()
    name = os.path.splitext(os.path.basename(f))[0] # Split filename from .pdf extension
    dp = name.index("$") + 1 # Find position of $ sign
    bookmarkname = name[dp:].replace("_", " ") # replace underscores with spaces
    output.addBookmark(bookmarkname, 0, parent=None) # Add bookmark
    output.append(open(f, 'rb'))
    output.write(open(f, 'wb'))

for f in pdfcdir:
    addbookmark(f)

The UDF works fine when applied to individual PDFs, but it won't add the bookmarks when put into the loop at the bottom of the code. Any ideas on how to make the UDF loop through all PDFs within pdfcdir?

xTHx
  • Sure... just one question. What's a UDF? – kindall Mar 28 '17 at 21:51
  • @kindall I'm _guessing_ it's a [user-defined function](https://en.wikipedia.org/wiki/User-defined_function) but perhaps not used correctly. In relation to this question, `output.append(open(f, 'rb'))` and `output.write(open(f, 'wb'))` do not make much sense. – roganjosh Mar 28 '17 at 21:55
  • Ah. Having grown up in Ohio, I was thinking United Dairy Farmers... – kindall Mar 28 '17 at 21:56
  • @kindall tbh, the more I read the question/code, the closer I come to the baseline of substituting in "United Dairy Farming" into the question and getting the same understanding. I might be well off on this one; it needs clarifying :) – roganjosh Mar 28 '17 at 22:04

1 Answer

I'm pretty sure that the issue you're having has nothing to do with the loop. Rather, you're passing just the filenames and not including the directory path, because `os.listdir` returns bare names. Python then tries to open these files relative to the process's current working directory (by default, the directory you launched the script from) rather than in the directory you read the filenames from.

So, join the directory name with each file name when calling your function.

for f in pdfcdir:
    addbookmark(os.path.join(cdir, f))
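
As a minimal sketch of the same idea, the listing step can build full paths up front so nothing downstream depends on the working directory. The helper names `list_pdf_paths` and `bookmark_name` are hypothetical (they are not part of PyPDF2), and the filename convention is assumed to be the one from the question:

```python
import os


def list_pdf_paths(directory):
    """Return full paths of the .pdf files in `directory` (hypothetical helper)."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(".pdf")
    )


def bookmark_name(path):
    """Derive a bookmark title from a name like 'Smith#00$Consolidated_Performance.pdf'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Keep everything after the first "$"; partition avoids a ValueError if "$" is missing.
    _, sep, tail = stem.partition("$")
    return (tail if sep else stem).replace("_", " ")
```

With these, the loop becomes `for path in list_pdf_paths(cdir): addbookmark(path)`, and every path handed to `addbookmark` already includes the directory.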
kindall
  • Now when I try to append the PDFs with the new bookmarks, the bookmarks don't show up in the final merged PDF. I'm using `.... for pdf in PDFfiles: merger.append(open(os.path.join(cdir, pdf), 'rb'), import_bookmarks=True)` and then writing to the new PDF. Any idea why the bookmarks won't be written to the new PDF? – xTHx Mar 29 '17 at 13:34
  • You should probably ask a new question for that, so everyone sees it rather than just me. – kindall Mar 29 '17 at 17:27
  • I was able to resolve this by applying the bookmark parameter of .append – xTHx Mar 29 '17 at 18:21
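
The last comment doesn't show the final code, so the following is only a guess at what it might have looked like. It assumes the `bookmark` keyword of `PdfFileMerger.append` in PyPDF2 1.x, and a hypothetical helper `merge_with_bookmarks` that takes the merger as an argument so it can be tried out without real PDFs:

```python
import os


def merge_with_bookmarks(merger, pdf_paths, out_path):
    """Append each PDF to `merger` with a top-level bookmark taken from its filename.

    `merger` is assumed to behave like PyPDF2 1.x's PdfFileMerger, i.e. to offer
    append(fileobj, bookmark=..., import_bookmarks=...) and write(fileobj).
    """
    handles = []
    try:
        for path in pdf_paths:
            stem = os.path.splitext(os.path.basename(path))[0]
            # Text after the first "$", underscores turned into spaces.
            title = stem.split("$", 1)[-1].replace("_", " ")
            fh = open(path, "rb")
            handles.append(fh)  # keep inputs open until the merged file is written
            merger.append(fh, bookmark=title, import_bookmarks=True)
        with open(out_path, "wb") as out:
            merger.write(out)
    finally:
        for fh in handles:
            fh.close()
```

With PyPDF2 1.x installed, a call might look like `merge_with_bookmarks(PdfFileMerger(), paths, "merged.pdf")`, where `paths` holds full paths to the input files. Newer releases renamed the merger class and some of its parameters, so check the documentation for the installed version.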