I want to save uploaded images in a bytea column in my PostgreSQL database. I'm looking for advice on how to save images from Rails into a bytea column, preferably with examples.

I use Rails 3.1 with the "pg" driver to connect to PostgreSQL.

Craig Ringer
user1466717

1 Answer

It's often not a good idea to store images in the database itself

See the discussions on is it better to store images in a BLOB or just the URL? and Files - in the database or not?. Be aware that those questions and their answers aren't about PostgreSQL specifically.

There are some PostgreSQL-specific wrinkles to this. PostgreSQL doesn't have any facilities for incremental dumps*, so if you're using pg_dump backups you have to dump all that image data for every backup. Storage space and transfer time can be a concern, especially since you should be keeping several weeks' worth of backups, not just the single most recent backup.

If the images are large or numerous you might want to consider storing images in the file system unless you have a strong need for transactional, ACID-compliant access to them. Store file names in the database, or just establish a convention of file naming based on a useful key. That way you can do easy incremental backups of the image directory, managing it separately to the database proper.
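The "convention of file naming based on a useful key" could look roughly like the sketch below. It assumes integer primary keys and a storage root of your choosing; `image_path` and `store_image` are hypothetical helper names, not part of Rails or any library.

```ruby
require 'fileutils'

# Hypothetical helper: derives a file path from a record's primary key, so
# no file name has to be stored in the database at all.
# For example, id 1234567 maps to <root>/001/234/567.png
def image_path(root, id, ext = 'png')
  padded = format('%09d', id)            # zero-pad to at least 9 digits
  File.join(root, padded[0...-6], padded[-6, 3], "#{padded[-3, 3]}.#{ext}")
end

# Writes the uploaded bytes to the conventional location and returns the path.
def store_image(root, id, data, ext = 'png')
  path = image_path(root, id, ext)
  FileUtils.mkdir_p(File.dirname(path))
  File.open(path, 'wb') { |f| f.write(data) }  # 'wb' leaves the bytes untouched
  path
end
```

Splitting the padded key into sub-directories keeps any single directory from growing huge, and the image directory can then be backed up incrementally with ordinary file tools.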

If you store the images in the FS you can't easily access them via the PostgreSQL database connection. OTOH you can serve them over HTTP directly from the file system much more efficiently than you could ever hope to when you have to query them from the DB first. In particular you can use sendfile() from Rails if your images are on the FS, but not from a database.

If you really must store the images in the DB

... then it's conceptually the same as in .NET, but the exact details depend on the Pg driver you're using, which you didn't specify.

There are two ways to do it:

  • Store and retrieve bytea, as you asked about; and
  • Use the built-in large object support, which is often preferable to using bytea.

For small images where bytea is OK:

  • Read the image data from the client into a local variable
  • Insert that into the DB by passing the variable as bytea. Assuming you're using the ruby-pg driver the test_binary_values example from the driver should help you.
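The two steps above can be sketched as follows, modelled loosely on the driver's test_binary_values sample. This is only a sketch: the `images` table and the helper names are hypothetical, and `conn` is assumed to be a `PG::Connection` (for example from `PG.connect`).

```ruby
# Assumed (hypothetical) table:
#   CREATE TABLE images (id serial PRIMARY KEY, data bytea NOT NULL);
# `conn` is expected to be a PG::Connection from the ruby-pg driver.

# Inserts raw image bytes as a binary-format parameter (format: 1), so no
# bytea escaping is needed. Returns the new row's id.
def insert_image(conn, bytes)
  res = conn.exec_params(
    'INSERT INTO images (data) VALUES ($1::bytea) RETURNING id',
    [{ value: bytes, format: 1 }]
  )
  res[0]['id'].to_i
end

# Fetches the bytes back, requesting binary result format (the trailing 1)
# so the value arrives unescaped. Returns nil if no such row exists.
def fetch_image(conn, id)
  res = conn.exec_params(
    'SELECT data FROM images WHERE id = $1::int', [id.to_s], 1
  )
  res.ntuples.zero? ? nil : res.getvalue(0, 0)
end
```

In a Rails controller the bytes would typically come from the upload itself, e.g. something like `params[:image].read` (the parameter name here is an assumption). Note that the whole image is held in memory on both the client and the server, which is why this only suits small images.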

For bigger images (more than a few megabytes) use lo instead:

For bigger images please don't use bytea. Its theoretical maximum is 1GB, but in practice you need three or more times as much RAM as the image size would suggest, so bytea is a poor fit for large images or other large binary data.

PostgreSQL has a dedicated lo (large object) type for that. On 9.1 just:

CREATE EXTENSION lo;
CREATE TABLE some_images(id serial primary key, image_data lo not null);

... then use lo_import to read the data from a temporary file that's on disk, so you don't have to fit the whole thing in RAM at once.

The driver ruby-pg provides wrapper calls for lo_create, lo_open, etc, and provides a lo_import for local file access too. See this useful example.
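As a rough sketch of the large object route with ruby-pg, assuming the `some_images` table above and a `PG::Connection` in `conn` (the helper names are hypothetical). The driver's `lo_import` and `lo_export` work on files local to the application, not to the database server, and large object calls need to run inside a transaction.

```ruby
# `conn` is expected to be a PG::Connection from the ruby-pg driver.

# Imports a file from the application server's disk as a large object and
# records its OID in some_images. Returns the new row's id.
def import_image(conn, path)
  id = nil
  conn.transaction do |c|
    oid = c.lo_import(path)   # streams the file, no need to hold it all in RAM
    res = c.exec_params(
      'INSERT INTO some_images (image_data) VALUES ($1::oid) RETURNING id',
      [oid.to_s]
    )
    id = res[0]['id'].to_i
  end
  id
end

# Looks up the large object's OID for a row and writes its contents to a
# local file. Returns the path that was written.
def export_image(conn, id, path)
  conn.transaction do |c|
    res = c.exec_params(
      'SELECT image_data FROM some_images WHERE id = $1::int', [id.to_s]
    )
    raise ArgumentError, "no image with id #{id}" if res.ntuples.zero?
    c.lo_export(res[0]['image_data'].to_i, path)
  end
  path
end
```

Doing the import and the INSERT in one transaction means a failed INSERT rolls the large object back too, rather than leaving an orphan behind.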

Please use large objects rather than bytea.


* Incremental backup is possible with streaming replication, PITR / WAL archiving, etc, but again increasing the DB size can complicate things like WAL management. Anyway, unless you're an expert (or "brave") you should be taking pg_dump backups rather than relying on replication and PITR alone. Putting images in your DB will also, by increasing the size of your DB, greatly slow down pg_basebackup, which can be important in failover scenarios.

The adminpack offers local file access via a Pg connection for superusers. Your webapp user should never have superuser rights or even ownership of the tables it works with, though. Do your file reads and writes via a separate secure channel like WebDAV.

Craig Ringer
  • The idea is as follows. My files will be stored in the database and file system. If the file is removed from the file system, it will again be automatically loaded from the database file and stored in a folder. – user1466717 Aug 13 '12 at 07:07
  • @user1466717 Urk, why? That's the worst of both worlds. What problem are you trying to solve with this approach? What's the underlying problem, the reason why you want to do that? – Craig Ringer Aug 13 '12 at 07:11
  • @user1466717 BTW, answer updated with links to [the examples in the ruby-pg tests](https://bitbucket.org/ged/ruby-pg/src/ef533f731814/sample/test_binary_values.rb) for `bytea` access. – Craig Ringer Aug 13 '12 at 07:16
  • When a page is viewed, the images will be served from the file system. If files are lost during a transfer or accidentally deleted, then on the first page load the picture is loaded from the database and recreated from scratch. This keeps all the images safe without loading them from the database every time the page loads. – user1466717 Aug 13 '12 at 07:16
  • @user1466717: whether to store image in the file system or in the database, see also http://dba.stackexchange.com/questions/736/is-it-better-to-store-images-in-a-blob-or-just-the-url – j.p. Aug 13 '12 at 07:17
  • @user1466717 OK, that sounds like a very strange approach, but whatever works for you I guess. To me it sounds more like a job for the application, copying default images from a read-only directory, zip archive, or whatever. Still, if they're small images or you don't mind big backups it won't hurt. – Craig Ringer Aug 13 '12 at 07:20
  • In case of storing the images in the filesystem and keeping only a link to the image in the database, PostgreSQL can also automatically delete the file when the link is deleted, see [here](http://www.devarticles.com/c/a/Ruby-on-Rails/More-Advanced-Database-Features-and-Rails/) – j.p. Aug 13 '12 at 13:00
  • @jug That refers to large objects, which are *within* the database. The `lo` pseudo-type takes care of this for you with no custom coding, so it's a safer option than hand-rolling triggers for the job. For managing external files you can `NOTIFY` a `LISTEN`ing process to tell it to delete the file; ideally your database shouldn't have any more file system access than it absolutely needs to, so it shouldn't be deleting them itself. – Craig Ringer Aug 13 '12 at 13:54
  • @CraigRinger: Thanks for the info, somehow I misunderstood that paragraph. – j.p. Aug 14 '12 at 13:41