29

For the sake of simplicity, suppose I'm developing a mobile app like Instagram. Users can download images from the server, and upload images of their own. Currently the server stores all images (in reality, just small thumbnails) in a MySQL database as BLOBs. It seems that the most common way to transfer images is by using Base64 encoding, which leaves me with two options:

  1. Server stores all images as BLOBs. To upload an image, client encodes it into Base64 string, then sends it to the server. Server decodes image BACK into binary format and stores it as BLOB in the database. When client requests an image, server re-encodes the image as Base64 string and sends it to the client, who then decodes it back to binary for display.
  2. Server stores all images as Base64 strings. To upload an image, client encodes it into Base64 string and sends it to the server. Server does no encoding or decoding, but simply stores the string in the database. When client requests an image, the Base64 string is returned to the client, who then decodes it for display.

Clearly, option #1 requires significantly more processing on the server, as images must be encoded/decoded with every single request. This makes me lean toward option #2, but some research has suggested that storing Base64 string in MySQL is much less efficient than storing the image directly as BLOB, and is generally discouraged.

I'm certainly not the first person to encounter this situation, so does anybody have suggestions on the best way to make this work?

Zeeshan Hassan Memon
  • 8,105
  • 4
  • 43
  • 57
Hundley
  • 3,167
  • 3
  • 23
  • 45
  • 2
    Option #3 is no images in the database in the first place. They have a system for this: The filesystem. – tadman Mar 26 '15 at 16:49
  • 2
    I started out storing file paths, but some research on the most recent versions of MySQL has suggested that it's actually more efficient to store small files (under 1 or 2 mb) as BLOBs. My images are only a few kb, so it's much easier to maintain this way. – Hundley Mar 26 '15 at 21:21
  • Backing up a database full of BLOBs is an absolute nightmare. It's expensive to replicate these, expensive to back them up, extremely painful to restore them, and excruciating to split them up when they get too big. Unless you're dealing with inconsequential amounts of data, this will blow up on you eventually. Files on disk can be replicated easily using something as rudimentary as `rsync`. Same is not true for MySQL. – tadman Mar 26 '15 at 21:31

3 Answers3

12

JSON assumes utf8, hence is incompatible with images unless they are encoded in some way.

Base64 is almost exactly 8/6 times as bulky as binary (BLOB). One could argue that it is easily affordable. 3000 bytes becomes about 4000 bytes.

Everyone should be able to accept arbitrary 8-bit codes, but not everybody does. Base-64 may be the simplest and overall best compromise for not having to deal with 8-bit data.

Since these are "small", I would store them in a table, not a file. I would, however, store them in a separate table and JOIN by an appropriate id when you need them. This allows queries that don't need the image to run faster because they are not stepping over the BLOBs.

Technically, TEXT CHARACTER SET ascii COLLATE ascii_bin would do, but BLOB makes it clearer that there is not really any usable text in the column.

Zeeshan Hassan Memon
  • 8,105
  • 4
  • 43
  • 57
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • To clarify, are you suggesting that I store the images as Base64 in a BLOB, or the binary data itself and encode on every fetch? As for your suggestion with JOINing - that's precisely what I'm doing, along with a Sphinx index. – Hundley Mar 26 '15 at 23:16
  • Base64 throughout its life. Then you don't need to do escaping anywhere. So, yes, Base64 in a BLOB without encode except on initial INSERT. This is _my opinion_. – Rick James Mar 26 '15 at 23:22
  • Interesting, maybe I'll give that a shot. It would definitely save a lot of CPU since fetches occur much more often than inserts. – Hundley Mar 26 '15 at 23:46
0

Why would you base64-encode the images on the wire? I think you're starting from a wrong assumption.

Julian Reschke
  • 40,156
  • 8
  • 95
  • 98
  • Could you clarify? Right now I'm sending all images as Base64-encoded JSON objects, with some metadata as well (so the client knows what to do with the image upon receipt). Is there a way to send the images as binary objects (no encoding) WITH metadata? What about the client POSTing a new image? My server uses Node.js with Express for HTTP, as well as [ws](https://github.com/websockets/ws) for some WebSocket features. – Hundley Mar 26 '15 at 21:29
  • if we dont send it over the wire, which i agree is bad. what suggestions do you have if client end wants to show an image that is stored in a database ? i guess a weblink is the only alternative then ? but our database already has thousands of base64 encoded images in a column. any help appreicated. – j2emanue Jul 25 '18 at 03:46
  • This should be a comment, it's certainly not an answer – Mario Gil Sep 02 '21 at 18:33
0

I don't see why the DB Server shouldn't always keep binary data in it's native form. Thus, use a BLOB. (But even if you did store the data in Base64 string, there is no need concern yourself about encoding/decoding performance because the IO's impact will be more significant.)

I don't get why the client should send the data in base64 though. Why not just "stream" it using a simple HTTP call?

Theodore Zographos
  • 2,215
  • 1
  • 24
  • 23
  • Switching to a "binary stream" requires some shifting of gears (in the protocol, etc). HTTP accepts only a limited subset of 256 different byte combinations and has kludges to accommodate other combos. Html-entities is one such, but it is quite bulky. Backslashes is another, but doesn't cover everything. Url-encoding 8-bit codes into 3 bytes. Etc. – Rick James Nov 06 '22 at 15:44