I'm working with Python 3 and I want to simulate writing to a file without actually creating one.

For example, my specific case is as follows:

merger = PdfFileMerger()

for pdf in files_to_merge:
    merger.append(pdf)

merger.write('result.pdf')  # This creates a file. I want to avoid this
merger.close()

# pdf -> binary
with open('result.pdf', mode='rb') as file:  # Likewise, I don't want to read the data back from an actual file
    file_content = file.read()

I think StringIO is a good candidate for this situation, but I don't know how to use it here, where the merger would need to write to a StringIO object instead of a file. Basic StringIO usage looks something like this:

output = StringIO()
output.write('This goes into the buffer. ')

# Retrieve the value written
print output.getvalue()

output.close() # discard buffer memory

# Initialize a read buffer
input = StringIO('Initial value for read buffer')

# Read from the buffer
print input.read()
Xar
  • I don't think I understand your question – roganjosh Nov 05 '19 at 20:28
  • Also, that isn't Python 3, it's Python 2 – roganjosh Nov 05 '19 at 20:30
  • @roganjosh I think in Python there are "file-like objects", which enable us to simulate working with files, but without having to actually create a real file. `StringIO` allows us to work with those file-like objects by creating a buffer. I'm asking what's the way to simulate writing to file, by using a file-like object. – Xar Nov 05 '19 at 20:33

1 Answer

Since the PdfFileMerger.write method supports writing to file-like objects, you can simply make the PdfFileMerger object write to a BytesIO object instead:

from io import BytesIO

from PyPDF2 import PdfFileMerger  # assumption: PdfFileMerger comes from PyPDF2 (pre-3.0 releases)

merger = PdfFileMerger()

for pdf in files_to_merge:
    merger.append(pdf)

output = BytesIO()
merger.write(output)
merger.close()

file_content = output.getvalue()
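
The same pattern can be tried without any PDF library. The following is a minimal sketch using only the standard library; `write_report` is a hypothetical stand-in for any writer that emits bytes to a file-like object, the way `merger.write` does. It also shows why a `StringIO` buffer is not suitable for binary data.

```python
from io import BytesIO, StringIO


def write_report(stream):
    """Hypothetical stand-in for a binary writer such as merger.write()."""
    stream.write(b"%PDF-1.4\n")
    stream.write(b"binary payload\n")


# A BytesIO buffer accepts bytes, so the writer works without touching disk.
binary_buffer = BytesIO()
write_report(binary_buffer)
file_content = binary_buffer.getvalue()  # everything written so far, as bytes

# A StringIO buffer only accepts str, so the same writer raises TypeError.
text_buffer = StringIO()
try:
    write_report(text_buffer)
    string_io_error = None
except TypeError as exc:
    string_io_error = str(exc)
```

Here `file_content` holds the bytes that would otherwise have gone to `result.pdf`, and `string_io_error` records the failure that a text buffer produces when it is handed bytes.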
blhsing
  • Thanks @blhsing! But I'm getting this error: `TypeError: string argument expected, got 'bytes'` when trying to do `merger.write(output)`. – Xar Nov 05 '19 at 20:52
  • I see. I forgot that PDF files are binary files, and you would therefore have to use `io.BytesIO` instead of `io.StringIO`. – blhsing Nov 05 '19 at 20:54
  • thanks! That worked! Now I'm trying to make the second part of my snippet work as well with StringIO, which is reading the file-like object. It is looking like this: `with open(output, mode='rb') as file: file_content = file.read()` and I'm getting this error: `Exception Value: expected str, bytes or os.PathLike object, not _io.BytesIO` – Xar Nov 05 '19 at 21:03
  • Ok, no worries, I handled the reading in a much more convenient way, by just doing `output.getvalue()` – Xar Nov 05 '19 at 21:22
  • Yes, exactly. I was about to write that. I've updated my answer accordingly regardless. Glad to be of help. – blhsing Nov 05 '19 at 21:28
  • I also am using `PdfFileMerger`, but in my case, I wanted to perform all PDF merges without any persistence to the local file system (at the end, it's shipped to an S3 location). A minor tweak to the answer and it worked perfectly! – bsplosion Apr 23 '20 at 22:47
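
The reading question raised in the comments can be illustrated with a short sketch, again assuming only the standard library and using placeholder bytes rather than a real PDF. It contrasts `getvalue()` with `read()`, and shows that `open()` rejects a buffer because it expects a path.

```python
from io import BytesIO

output = BytesIO()
output.write(b"first chunk, ")
output.write(b"second chunk")

# getvalue() returns the whole buffer regardless of the current position.
whole = output.getvalue()

# read() starts at the current position, which is the end after writing,
# so it returns nothing until the buffer is rewound.
at_end = output.read()
output.seek(0)
after_rewind = output.read()

# open() expects a path, not a buffer, so passing the BytesIO raises TypeError.
try:
    open(output, mode="rb")
    open_error = None
except TypeError as exc:
    open_error = str(exc)
```

If the buffer is handed to other code that calls `read()` on it, such as an uploader, rewinding with `seek(0)` first is likely needed. Otherwise that code would start reading at the end and see no data.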