0

I am trying to rename a list of PDF files by extracting the name from each file using pyPdf. I tried to use a for loop to rename the files, but I always get an error with code 32 saying that the file is being used by another process. I am using Python 2.7. Here's my code:

import os, glob
from pyPdf import PdfFileWriter, PdfFileReader

# this function extracts the name of the file
def getName(filepath):
    output = PdfFileWriter()
    input = PdfFileReader(file(filepath, "rb"))
    output.addPage(input.getPage(0))
    outputStream = file(filepath + '.txt', 'w')
    output.write(outputStream)
    outputStream.close()

    outText = open(filepath + '.txt', 'rb')
    textString = outText.read()
    outText.close()

    nameStart = textString.find('default">')
    nameEnd = textString.find('_SATB', nameStart)
    nameEnd2 = textString.find('</rdf:li>', nameStart)

    if nameStart:
        testName = textString[nameStart+9:nameEnd]
        if len(testName) <= 100:
            name = testName + '.pdf'
        else:
            name = textString[nameStart+9:nameEnd2] + '.pdf'
    return name


pdfFiles = glob.glob('*.pdf')
m = len(pdfFiles)
for each in pdfFiles:
    newName = getName(each)
    os.rename(each, newName)
Martin Thoma
chidimo
  • Please post the error traceback and the line number where it appears. – Feanor Nov 14 '13 at 12:31
  • Are you on Windows? And is there maybe someone having an open file handle on any of the files you try to rename? – Alfe Nov 14 '13 at 12:42
  • Yes, I'm on Windows. How do I post a picture? I have a screenshot of the error in my command window. – chidimo Nov 14 '13 at 12:48
  • In the editor there is a picture icon in the tool bar. You can upload pictures there. – Alfe Nov 14 '13 at 13:01

3 Answers

1

Consider using Python's with statement. It closes the file for you when the block ends, so you do not need to handle closing it yourself:

def getName(filepath):
    output = PdfFileWriter()
    with file(filepath, "rb") as pdfFile:
        input = PdfFileReader(pdfFile)
        ...
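
A fuller sketch of the same idea, using only the standard library so it runs without pyPdf: the file is read inside a with block, and the rename happens only after the block has closed the handle. The function name read_then_rename and the 8-byte header read are illustrative assumptions, not part of the original code, and open is used in place of Python 2's file so the sketch also runs on Python 3.

```python
import os


def read_then_rename(old_path, new_path):
    """Read the first bytes of a file, then rename it once the handle is closed."""
    # The with block closes the handle on exit, even if read() raises.
    with open(old_path, "rb") as handle:
        header = handle.read(8)  # stand-in for the real PDF parsing step
    # The handle is closed here, so our own process no longer blocks the rename.
    os.rename(old_path, new_path)
    return header
```

In the original getName, the equivalent change would be to do all PdfFileReader work inside the with block and call os.rename only after it ends.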
Alfe
0

You're not closing the input stream (the file) used by the pdf reader. Thus, when you try to rename the file, it's still open.

So, instead of this:

input = PdfFileReader(file(filepath, "rb"))

Try this:

inputStream = file(filepath, "rb")
input = PdfFileReader(inputStream)
# ... when done with this file ...
inputStream.close()
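
One caveat with an explicit close() is that an exception raised between opening and closing skips the close() call. A minimal sketch of guarding against that with try/finally, using only the standard library (the helper name read_first_bytes and its count parameter are hypothetical, and open stands in for Python 2's file):

```python
def read_first_bytes(path, count=8):
    """Return the first bytes of a file, closing the handle explicitly."""
    stream = open(path, "rb")
    try:
        return stream.read(count)
    finally:
        # Runs whether or not read() raised, so the handle never stays open.
        stream.close()
```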
elmart
  • How do I share this answer with other users who have contributed? – chidimo Nov 14 '13 at 13:32
  • If you consider an answer definitive (it solves your problem), then you select it as "the accepted answer". To do that, click the check mark to the left of the paragraph. If an answer is helpful but not definitive, you can instead just upvote it (click the up arrow to the left of the paragraph). – elmart Nov 14 '13 at 17:44
0

It does not look like you close the file object associated with the PDF reader object. It may be closed automatically at the end of the function, but to be sure, you might want to create a separate file object, pass it to PdfFileReader, and close the file handle when done. Then rename.

The code below is from the SO question: How to close pyPDF "PdfFileReader" Class file handle

import os.path
from pyPdf import PdfFileReader

fname = 'my.pdf'
fh = file(fname, "rb")
input = PdfFileReader(fh)

fh.close()
os.rename(fname, 'my_renamed.pdf')
Community
Paul