Why does saving a file that I opened with fitz change its size?

Question

I looked for what opening a file with fitz does to the file, but didn't find anything. The code is simple:

import fitz
doc = fitz.open('a.pdf')
doc.save('b.pdf')

What I don't understand is why this changes the PDF's size. With the file I tried, the size went from 829 KB to 854 KB.

I am not comfortable with this because I want to change one property of a large number of files, and I can't do that before being sure it won't alter them in any way other than the property I want to change.

BTW, all I want is to set the internal title of a PDF to the displayed name of its file.

import fitz
doc = fitz.open(r'a.pdf')
doc.metadata['title']=None
doc.setMetadata(doc.metadata)
doc.save(r'b.pdf')

Can I assume I won't lose any information in this second example? And why does the size change when I just open and save the file in the first example?
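
For reference, the stated goal (title equal to the file's displayed name) might look roughly like the sketch below. It assumes that "displayed name" means the file name without its extension, and that the installed PyMuPDF provides `set_metadata` (older releases spelled it `setMetadata`, as in the snippet above). The helper names `title_from_path` and `set_title_to_filename` are made up for illustration.

```python
from pathlib import Path


def title_from_path(path):
    """Return the file name without its extension, e.g. 'a' for 'dir/a.pdf'."""
    return Path(path).stem


def set_title_to_filename(src, dst):
    """Copy `src` to `dst`, setting the PDF title to the source file's name.

    Hypothetical helper: all other metadata fields are passed through as read.
    """
    import fitz  # PyMuPDF; imported here so title_from_path works without it

    doc = fitz.open(src)
    try:
        meta = dict(doc.metadata or {})  # copy, so the change is explicit
        meta["title"] = title_from_path(src)
        doc.set_metadata(meta)
        doc.save(dst)  # write to a new file; the original stays untouched
    finally:
        doc.close()
```

Writing to a separate output file, as here, makes it possible to compare the result against the original before replacing anything.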

score 0 · Answer 1 · answered Aug 14 '23 at 10:03

For me, the following helped:

import fitz

doc = fitz.open(r'a.pdf')

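# Clear the standard metadata dictionary.
# (Assigning to doc.metadata only changes the Python attribute, not the PDF.)
doc.set_metadata({})

# Remove all XML metadata.
doc.del_xml_metadata()

# garbage=4 removes unused objects and merges duplicates, including streams.
doc.save(filename=r'b.pdf',
         garbage=4)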

With my files, this usually reduces the size by more than 30%.

garbage (int):

0 = none
1 = remove unused (unreferenced) objects.
2 = in addition to 1, compact the xref table.
3 = in addition to 2, merge duplicate objects.
4 = in addition to 3, check stream objects for duplication. This may be slow because such data are typically large.
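
To see how much each garbage level matters for a particular file, a small comparison along these lines could help. It is only a sketch: the helper names `percent_change` and `sizes_by_garbage_level` are made up for illustration, and it assumes a PyMuPDF version where `save` accepts the `garbage` keyword as described above.

```python
import os
import tempfile


def percent_change(before, after):
    """Return the size change from `before` to `after` in percent."""
    return (after - before) / before * 100.0


def sizes_by_garbage_level(path, levels=(0, 1, 2, 3, 4)):
    """Save `path` once per garbage level and return {level: size in bytes}.

    Hypothetical helper: output goes to a temporary directory, so the
    original file is never overwritten.
    """
    import fitz  # PyMuPDF; imported here so percent_change works without it

    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for level in levels:
            out = os.path.join(tmp, f"garbage_{level}.pdf")
            doc = fitz.open(path)  # reopen so each save starts from the same state
            try:
                doc.save(out, garbage=level)
            finally:
                doc.close()
            sizes[level] = os.path.getsize(out)
    return sizes
```

Calling `sizes_by_garbage_level('a.pdf')` and passing each result to `percent_change` together with `os.path.getsize('a.pdf')` shows whether a given level shrinks or grows the file.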

You're right! But in my examples there were a lot of files with junk metadata. That is why I included three steps (the metadata dict, the XML metadata, and garbage) to show three ways to clean the file, plus the garbage documentation. - Timur U, Aug 14 '23 at 13:08

score -1 · Answer 2 · answered Jun 25 '21 at 19:24

You should check the metadata of the document. It may have information on modification date, saving date, etc., that could explain the increased size.
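
One way to check this is to read the metadata of the original and the saved file and list the fields that differ. This is a rough sketch, not a complete comparison of the two PDFs: `metadata_diff` and `compare_pdf_metadata` are hypothetical helper names, and only the `metadata` dictionary exposed by PyMuPDF is inspected.

```python
def metadata_diff(before, after):
    """Return {key: (old, new)} for every key whose value differs."""
    keys = sorted(set(before) | set(after))
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }


def compare_pdf_metadata(path_a, path_b):
    """Open both PDFs and return the metadata fields that differ."""
    import fitz  # PyMuPDF; imported here so metadata_diff works without it

    doc_a = fitz.open(path_a)
    doc_b = fitz.open(path_b)
    try:
        return metadata_diff(dict(doc_a.metadata or {}), dict(doc_b.metadata or {}))
    finally:
        doc_a.close()
        doc_b.close()
```

An empty result from `compare_pdf_metadata('a.pdf', 'b.pdf')` would suggest the size difference comes from somewhere other than these metadata fields.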

Gustavo Bedendo
