
Maybe this question should be asked on https://dba.stackexchange.com/ instead, I'm not sure. Please advise in comments or move it there.

For this project I'm using MySQL 5.6.19 hosted at Amazon RDS.

Summary

I'm going to store photos in the database, in a BLOB column of an InnoDB table, and I would like to know the optimal way to do it. I'm looking for official documentation or some method(s) that would allow me to compare different variants.

When searching for this topic, one finds a lot of discussions and questions about whether it is better to store binary files in a database BLOB or in the file system, with the database holding only file paths and names. Such a discussion is beyond the scope of this question. For this project I need consistency and referential integrity, so files are going to be stored in BLOBs; the question is about the details of how exactly to do it.

Database schema

Here is the relevant part of the schema (so far). There is a table Contracts with some general information about each contract and a primary key ID. For each Contract there can be several (~10) photos taken, so I have a table ContractPhotos:

CREATE TABLE `ContractPhotos` (
  `ID` int(11) NOT NULL,
  `ContractID` int(11) NOT NULL,
  `PhotoDateTime` datetime NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `IX_ContractID` (`ContractID`),
  CONSTRAINT `FK_ContractPhotos_Contracts` FOREIGN KEY (`ContractID`) REFERENCES `Contracts` (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

For each photo I will store the original full-resolution image plus a few scaled-down versions, so I have a table ContractPhotoVersions:

CREATE TABLE `ContractPhotoVersions` (
  `ID` int(11) NOT NULL,
  `ContractPhotoID` int(11) NOT NULL,
  `PhotoVersionTypeID` int(11) NOT NULL,
  `PhotoWidth` int(11) NOT NULL,
  `PhotoHeight` int(11) NOT NULL,
  `FileSize` int(11) NOT NULL,
  `FileMD5` char(32) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `IX_ContractPhotoID` (`ContractPhotoID`),
  CONSTRAINT `FK_ContractPhotoVersions_ContractPhotos` FOREIGN KEY (`ContractPhotoID`) REFERENCES `ContractPhotos` (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Finally, there is a table that holds the actual binary data of all images. I know that MySQL allows storing up to 4GB in a LONGBLOB column, but during my search I came across another MySQL limitation: max_allowed_packet. On my instance of MySQL this variable is 4MB. My understanding of this variable after reading the docs is that, effectively, a single row can't exceed 4MB. It is pretty normal to have a photo that is larger than 4MB, so in order to be able to INSERT and SELECT such files I intend to split each file into small chunks:

CREATE TABLE `PhotoChunks` (
  `ID` int(11) NOT NULL,
  `ContractPhotoVersionID` int(11) NOT NULL,
  `ChunkNumber` int(11) NOT NULL,
  `ChunkSize` int(11) NOT NULL,
  `ChunkData` blob NOT NULL,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `IX_ContractPhotoVersionID_ChunkNumber` (`ContractPhotoVersionID`,`ChunkNumber`),
  CONSTRAINT `FK_PhotoChunks_ContractPhotoVersions` FOREIGN KEY (`ContractPhotoVersionID`) REFERENCES `ContractPhotoVersions` (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Besides, I will be able to upload large photos into the database a few chunks at a time and resume the upload when the connection drops.
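
For illustration, this is roughly how I intend to work with the chunks (just a sketch; the ? placeholders stand for values supplied by the application, and chunk numbering starting at 1 is only an assumption for the example, not a final design):

-- The application splits the file into pieces and INSERTs them one by one
INSERT INTO PhotoChunks (ID, ContractPhotoVersionID, ChunkNumber, ChunkSize, ChunkData)
VALUES (?, ?, ?, ?, ?);

-- Resume an interrupted upload: find the last chunk that made it to the server
SELECT MAX(ChunkNumber) AS LastChunk
FROM PhotoChunks
WHERE ContractPhotoVersionID = ?;

-- Read the photo back: fetch the chunks in order and concatenate them on the client
SELECT ChunkData
FROM PhotoChunks
WHERE ContractPhotoVersionID = ?
ORDER BY ChunkNumber;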

Data volume

The estimated volume of data is 40,000 full-resolution photos at ~5MB each => 200GB. The scaled-down versions will most likely be 800x600 at ~120KB each => an extra ~5GB. Images will not be UPDATEd. They will eventually be deleted after several years.

Question

There are many ways to split a file into smaller chunks: you can split it into 4KB, 8KB, 64KB, etc. What would be the optimal chunk size when using the InnoDB storage engine, to minimize wasted space first and maximize overall performance second?

I found these docs: http://dev.mysql.com/doc/refman/5.6/en/innodb-file-space.html, but there is not much detail about BLOBs. It says that the page size is 16KB.

The maximum row length, except for variable-length columns (VARBINARY, VARCHAR, BLOB and TEXT), is slightly less than half of a database page. That is, the maximum row length is about 8000 bytes.

I really expected the official documentation to be more precise than "about 8000 bytes". The following paragraph is the most interesting:

If a row is less than half a page long, all of it is stored locally within the page. If it exceeds half a page, variable-length columns are chosen for external off-page storage until the row fits within half a page. For a column chosen for off-page storage, InnoDB stores the first 768 bytes locally in the row, and the rest externally into overflow pages. Each such column has its own list of overflow pages. The 768-byte prefix is accompanied by a 20-byte value that stores the true length of the column and points into the overflow list where the rest of the value is stored.

Considering the above, there are at least these strategies:

  • choose a chunk size such that each chunk is stored locally within the page, without involving off-page storage.
  • choose a chunk size such that the whole BLOB is stored off-page.
  • I don't like the idea of storing a BLOB partially within the page and partially off-page. But, hey, maybe I'm wrong.

I also came across this doc https://dev.mysql.com/doc/refman/5.6/en/innodb-row-format-dynamic.html and at this point I realised that I wanted to ask this question. It is too overwhelming for me right now, and I hope that there is somebody who has had practical experience with this topic.

I don't want to end up wasting half of the disk space by inadvertently choosing a poor chunk size and row format. My concern is that if I choose to store 8000 bytes for each chunk, plus 16 bytes for the 4 ints in the same row of the PhotoChunks table, it would exceed that magic half of the page size and I would end up spending 16KB per row for only 8000 bytes of data.

Is there a way to check how much space is actually wasted in this way? In the Amazon RDS environment I'm afraid there is no way to have a look at the actual files that the InnoDB table consists of. Otherwise, I would simply try different variants and see the final file size.
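
The closest thing I can think of on RDS is to compare the space InnoDB reports as allocated for the table with the actual payload stored in it, via information_schema (only a rough estimate, since these statistics are approximate):

-- Space allocated by InnoDB for the table (approximate statistics)
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'PhotoChunks';

-- Actual payload stored in the chunks
SELECT SUM(ChunkSize) AS PayloadBytes, COUNT(*) AS Chunks
FROM PhotoChunks;

Comparing DATA_LENGTH against SUM(ChunkSize) after loading the same set of photos with different chunk sizes should at least show the relative overhead of each variant.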

So far I can see that there are two parameters: the row format and chunk size. Maybe there are other things to consider.

Edit

Why I am not considering changing the max_allowed_packet variable. From the docs:

Both the client and the server have their own max_allowed_packet variable, so if you want to handle big packets, you must increase this variable both in the client and in the server.

I use the MySQL C API to work with this database, and the same C++ application talks to 200 other MySQL servers (completely unrelated to this project) using the same libmysql.dll. Some of these servers are still MySQL 3.23. So my app has to work with all of them. Frankly speaking, I haven't looked into the docs on how to change the max_allowed_packet variable on the client side of the MySQL C API.
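
The most I could probably do is check what each server allows at connect time and keep the chunk size below the smallest value I encounter; something like this (just a sketch of the idea):

-- Server-side limit on this particular server
SHOW VARIABLES LIKE 'max_allowed_packet';

-- On reasonably recent servers the same value is also available as an expression
SELECT @@max_allowed_packet;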

Edit 2

@akostadinov pointed out that there is mysql_stmt_send_long_data() to send BLOB data to the server in chunks, and people said that they have managed to INSERT BLOBs that are larger than max_allowed_packet. Still, even if I manage to INSERT, say, a 20MB BLOB with max_allowed_packet=4MB, how do I SELECT it back? I don't see how I can do it.

I would appreciate it if you pointed me in the right direction.

Vladimir Baranov
  • Did you know that one can modify `max_allowed_packet` on Amazon RDS by using [DB Parameter Groups](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html)? – eggyal Apr 13 '15 at 08:57
  • not a duplicate as far as I see, sorry for flagging – akostadinov Apr 13 '15 at 09:12
  • I'm not sure why you say you'd "end up spending 16KB for each row". I can only imagine you think InnoDB is limited to one record per index page? That is absolutely not correct: each index page will be filled with multiple records, until it is unable to hold any more. Indeed, the requirement that a record's inline stored-data be less than half a page in length is to *ensure* that each page contains *at least* two records! – eggyal Apr 13 '15 at 09:31
  • I'm also not sure why you "don't like the idea of storing a BLOB partially within the page and partially off-page", since that is a performance enhancement (MySQL must always look into the page first in order to find the reference to the field's off-page storage, so encountering the first 768 bytes of data at the same time saves on a little I/O). However, if you *really* don't like it, neither the `COMPRESSED` nor `DYNAMIC` formats store a prefix in the index page. – eggyal Apr 13 '15 at 09:34
  • Having born all that in mind, trying to ensure your records stay within the index page only saves you ~20 bytes of storage per record (for the reference)—but then you'll need many more records, so that gain would likely be very quickly lost elsewhere. Better to store as much as possible in a single (off-page, if need be) field. – eggyal Apr 13 '15 at 09:40
  • Overall, it feels like you have forgotten Knuth's maxim "*premature optimisation is the root of all evil*" and are wasting effort on pointless micro-optimisations. – eggyal Apr 13 '15 at 09:41
  • @eggyal, 1) I added my concerns about changing `max_allowed_packet` to the question. 2) I listed 3 possible strategies that I can see. There can be more and I'm interested to know what is available/better/recommended. The docs I found are not very clear and I'm confused. 3) If the poor choice of a few parameters can lead to wasting 50% of disk space, then I don't think it is a premature optimization. – Vladimir Baranov Apr 13 '15 at 10:53
  • @eggyal, Why 50%? If I have chunks, say, 8200 bytes in size, then definitely only one row can fit into 16KB page, so almost half of it will be wasted. Am I wrong here? Maybe. That's why I'm asking here. What if I use 7990 bytes? Will two rows definitely fit? Where is that threshold and how to find it out? I didn't find such details in docs. Maybe the whole idea of trying to store data in-row is flawed. That's why I'm asking here. – Vladimir Baranov Apr 13 '15 at 10:55
  • 1
    I already explained above that those assumptions are wrong. If you have chunks that are 8200 bytes in size then the BLOB will *always* be stored outside the index page—only the internal row overhead, remaining columns and 20-byte BLOB reference will be stored inline (also the 768-byte prefix if you're using one of those row formats); consequently *many* records will still fit in each page. If your chunks are small enough in size that two records can fit within an index page, then *by definition* you are achieving multiple records per page—but you're wasting a lot of overhead in the process. – eggyal Apr 13 '15 at 11:30
  • Thank you @eggyal, now I understand what you meant by premature optimization. You are saying that DB engine is smart enough to not waste much space in any case and playing with these parameters doesn't affect end result much. I can go with default settings for row format, but if I do split data in chunks I have to decide what size of the chunk to choose. You said "Better to store as much as possible in a single (off-page, if need be) field". This MySQL instance has `max_allowed_packet`=4MB, I've seen others with 1MB, so to play it safe the chunk size should be 512KB ? – Vladimir Baranov Apr 13 '15 at 11:50

2 Answers


I stand by my answer on forums.mysql.com from 2 years ago. Some further notes:

  • 16M is likely to work for max_allowed_packet, however I have no evidence that it works beyond that.
  • In an application I worked on several years ago, it seemed that a chunk size of about 50KB was 'optimal'.
  • max_allowed_packet can be set in /etc/my.cnf. But if you don't have access to that, you are stuck with its value. You can get it in any(?) version by doing SHOW VARIABLES LIKE 'max_allowed_packet'. (I'm reasonably sure back to 4.0, but not sure about 3.23.) So that could be an upper limit on your chunk size.
  • InnoDB will split big BLOB/TEXT fields into 16KB blocks. Probably each block has some overhead, so you don't get exactly 16KB.
  • Antelope versus Barracuda, and other settings, control whether the first 768 bytes of the BLOB are stored in the record. If none is stored there, there is a 20-byte pointer to the off-block storage.
  • Today, 16MB may seem like a reasonable limit for picture size; tomorrow it won't.
  • If you are running a new enough version of MySQL, innodb_page_size can be raised from 16K to 32K or 64K. (And the ~8000 goes up to ~16000, but not ~32000.)
  • If replication is involved, chunking becomes more important. But there can be some extra tricky business with the 'sequence number' for the chunks. (Ask me if you need to go this direction.)
  • Adding the above comments together, I suggest a chunk size of MIN(64700, max_allowed_packet) bytes as a reasonable compromise (see the SQL sketch after this list), even if you can't control innodb_file_format. Only 1-2% of disk space will be wasted inside this "photos" table (assuming pictures of, say, about 1MB).
  • Compression is useless; JPGs are already compressed.
  • Most of the time is in I/O; second most is in network chatter between client and server. The point here is... C vs PHP will not make much difference when it comes to performance.
  • The ~8000 bytes per record is irrelevant in this discussion. That applies to a table with lots of columns -- they can't add up to more than ~8K. Most of the BLOB will go off-page, leaving only 60-800 bytes per row, hence 15-200 rows per 16KB block (average, after other types of overhead).
  • PARTITION is unlikely to be of any use.
  • Is "chunking a premature optimization"? It is not an "optimization" if you are hitting a brick wall because of max_allowed_packet.
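
A quick way to sanity-check those settings and compute the suggested chunk size on a given server (just a sketch, using the table from the question):

-- Suggested chunk size: 64700 bytes, capped by what the server accepts in one packet
SELECT LEAST(64700, @@max_allowed_packet) AS suggested_chunk_size;

-- Check the file format and the row format actually used by the chunk table
SHOW VARIABLES LIKE 'innodb_file_format';
SELECT TABLE_NAME, ROW_FORMAT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'PhotoChunks';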
Rick James
  • Thank you for your list. I will try to make my question more concise and highlight particular concerns that I had. – Vladimir Baranov Apr 18 '15 at 11:14
  • 2
    The 'answer of two years ago' was posted literally two years before this one. It is a collection of links to Ricks previous answers in the forum. Find it at https://forums.mysql.com/read.php?52,584053,584159#msg-584159 Also check out Ricks page http://mysql.rjweb.org/ There's a ton of mysql stuff to dwell on. His forums.mysql.com best of is impressive: http://mysql.rjweb.org/bestof.html Nice Job, Rick! – cachius Sep 07 '20 at 16:10
  • Thanks @rein . A few years ago, I moved away from mysql.forum. In doing so, I stopped maintaining "best of". MySQL does not change very fast, so _most_ of "best of" is still valid. However, there is nothing there about the substantive changes that come with 8.0. I _do maintain_ the rest of mysql.rjweb.com. I also edit my answers on stackoverflow and dba.stackexchange when I spot errors and out-of date info. – Rick James Sep 07 '20 at 16:26

One approach to try is using a long data send as described here: Is there any way to insert a large value in a mysql DB without changing max_allowed_packet?

Another approach, as you suggest, is to split data into chunks. See one possible approach in this thread: http://forums.mysql.com/read.php?20,601656,601656

Another is, given that you set some maximum image size limit in your user interface, to increase the packet size accordingly. Do you allow images larger than 16MB?

If you ask me, I'd avoid implementing chunking as it looks more like a premature optimization instead of letting DB do its own optimizations.

akostadinov
  • Thank you. Even though the answer is for `php`, it sounds promising. I'm using the C API and I have never used prepared statements with MySQL before (never really needed them for my projects), so I didn't look at the set of functions that are available for them. I'll definitely check it out. `stmt->send_long_data` may be the key. I also added a note to the question about `max_allowed_packet`. I'd rather not add artificial restrictions for the user, if possible (max 16MB image size). – Vladimir Baranov Apr 13 '15 at 10:58
  • @VladimirBaranov, the restriction is not artificial. E.g. if you expect an avatar, then allowing 16MB is unreasonably large. If the user sends more data, then he is doing something wrong or trying to abuse the system. So it depends on the use case. – akostadinov Apr 13 '15 at 13:03
  • My case is photos. Right now they are ~5MB JPEGs, which means that depending on the scene and compression quality the file size may grow. The next generation of cameras will have even more megapixels, so in a few years 16MB may not be enough. Besides, what if they want to store RAW in addition to JPEG after the system goes into production and becomes a success? It is hard to predict such things. Also, there will be third-party apps that read data from this DB as well. That's why I'd prefer a universal solution with chunks that is guaranteed to work with any server and any client. – Vladimir Baranov Apr 14 '15 at 00:31
  • If I can't/don't want to/it's too hard to change `max_allowed_packet`, what options do I have? I looked at [`mysql_stmt_send_long_data()`](http://dev.mysql.com/doc/refman/5.6/en/mysql-stmt-send-long-data.html). 1) As of MySQL 5.6.3, the `max_allowed_packet` system variable controls the maximum size of parameter values that can be sent with `mysql_stmt_send_long_data()`. Oops. Looks like they closed the loophole. 2) Even if I manage to `INSERT`, say, a 20MB BLOB with `max_allowed_packet`=4MB, how do I `SELECT` it back? I don't see how I can do it. – Vladimir Baranov Apr 14 '15 at 00:38
  • @VladimirBaranov, this looks like what you are looking for (unless you go with chunking): http://stackoverflow.com/questions/8489338/how-to-retrieve-image-in-mysql-database-longblob also this might help you: http://stackoverflow.com/questions/8257941/retrieve-longblob-from-mysql-in-c-sharp ... if you do chunking then you don't need anything special. Just read all chunks and stream them to target. – akostadinov Apr 14 '15 at 14:16