1

I have to store file data in a blob field using Firebird. Every time I execute the query, the database file size increases, even if it is executed with the same parameters.

Is it possible to avoid this problem?

update or insert into FILES(FILENAME, FILEVERSION, SHA512, FILESTATUS, FILEDATA)
  values(''%s'', ''%s'', ''%s'', %d, :FILEDATA)
  matching (FILENAME, FILEVERSION)

Edit: Sorry, I didn't specify the problem in detail, and I found a workaround for it: a first query examines if the file data is already stored, and the second query is executed only if it isn't. I think, it can be combined in a 'execute block' statement.

René Hoffmann
  • 2,766
  • 2
  • 20
  • 43
malom
  • 223
  • 2
  • 11
  • 1
    What exact firebird version and which server mode are you using? – René Hoffmann Nov 06 '19 at 16:05
  • Firebird 2.5 superserver – malom Nov 06 '19 at 16:21
  • That is not an exact version. Are you using 2.5.9 or an older version? – Mark Rotteveel Nov 06 '19 at 16:21
  • Firebird 2.5.1.26351, so it is an older version. Should I upgrade it to the last 2.5 and the problem will be solved? – malom Nov 06 '19 at 16:24
  • Yes, it is recommended to upgrade, because the version you are using has known security vulnerabilities. But do read the [release notes](https://www.firebirdsql.org/file/documentation/release_notes/html/en/2_5/rlsnotes25.html). Upgrading from 2.5.1 will require a backup and restore if you're using compound indexes (see this note for [2.5.3](https://www.firebirdsql.org/file/documentation/release_notes/html/en/2_5/notes-253.html)). I'm not sure if upgrading to 2.5.9 will resolve your problem though, I recall that recently a bug was fixed with temporary blobs, but that might have been in 3.0.5. – Mark Rotteveel Nov 06 '19 at 16:34
  • I'll see if I can reproduce in 2.5.9 and 3.0.5 (snapshot), however I won't have time until this weekend. I think there are two possibilities: Firebird doesn't clean up the blobs allocated for the parameter (which would be a bug), or the garbage collection of those blobs happens much later. – Mark Rotteveel Nov 06 '19 at 16:57
  • Mark it would only be a bug if he COMMITS transaction (or rolls back). While transacton is active, temporary blobs can not be freed - it will lead to use-after-free aka notorious "invalid blob id". So the question is how often does malom close his transactions – Arioch 'The Nov 06 '19 at 21:19
  • Also it feels weird seeing `''%s''` and `:FILEDATA` in one query - why not use all parameters and prepare query, rather than risk injections and keep re-parsing again and again ? – Arioch 'The Nov 06 '19 at 21:22
  • - I fabricated a simple class that encapsulates an IBQuery and an IBTransaction component, and the SQL query is executed by a separate instance of that class. So, the transaction will be committed after each execution of the SQL query. - Also this is the reason of the weird query parameters. When I realized that a large blob cannot be passed as a string, I hacked my code and added a method to deal with a single blob parameter. - Thanks for the note on upgrading FB server 2.5.1 . It is deployed on more than 100 PC-s and would be hard to upgrade them automatically. – malom Nov 07 '19 at 08:55
  • 1) just to be on safe side, can you check using Trace API that transactions are really closed (like, Commit not Commit-Retaining), statements closed/freed, etc. Sadly, Trace API does not expose blobs operations... One simple way to use the API is FBProfiler from SourceForge 2) while it would be hard to upgrade customers, you can run your test on FB 2.5.9 and maybe on FB 3 just to see if it will make a difference. – Arioch 'The Nov 07 '19 at 09:57

1 Answers1

1

After some testing with various versions, the only way to reproduce this behaviour is if garbage collection is inhibited by a long-running transaction. This means that while you are running those update or insert statements, another transaction is running concurrently, preventing the garbage collector from cleaning up those blobs.

Interestingly enough, when running multiple update or insert statements in a single transaction while another transaction is running in parallel, the database doesn't grow. This would seem to indicate that within the same transaction, the space of the blob is reused.

This problem seems to be fixed in Firebird 4 (checked with snapshot 4.0.0.1641), this Firebird version is still in development though.

In Firebird 3 and earlier, the solution/workaround would be to make sure that you don't have long-running transactions on your database that prevent garbage collection, or to try and run these updates in a single transaction as much as possible.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197