I'm writing a bit of software that needs to flatten data from a hierarchical format into a tabular one. Instead of doing it all in a programming language every time and serving it up, I want to cache the results for a few seconds and use SQL to sort and filter. When in use, we're talking roughly 400,000 writes and 1 or 2 reads over the course of those few seconds.
Each table will contain 3 to 15 columns. Each row will contain from 100 to 2,000 bytes of data, although some rows may occasionally reach 15,000 bytes. I can clip data if necessary to keep things sane.
The main options I'm considering are:
MySQL's MEMORY engine
A good option, almost purpose-built for my use case! But: "MEMORY tables use a fixed-length row-storage format. Variable-length types such as VARCHAR are stored using a fixed length. MEMORY tables cannot contain BLOB or TEXT columns." Unfortunately, I do have text fields up to maybe 10,000 characters long, and even that isn't a hard limit. I could size the VARCHAR columns based on the maximum text length I encounter while doing my flattening, but that's not exactly elegant. Also, for my occasional 15,000-character row, does that mean I need to allocate 15,000 characters for every row in the table? If there were 100,000 rows, that's roughly 1.5 GB, not including overhead!
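For concreteness, here's a minimal sketch of the kind of MEMORY table I'm describing; the table and column names (flattened_cache, id, category, body) are placeholders for my real schema, and I've pinned the big column to latin1 just to keep the byte math simple:

    -- MEMORY uses fixed-length row storage, so every row reserves the full
    -- declared width of body (~15,000 bytes here; multiply by the max bytes
    -- per character if a multi-byte charset is used instead of latin1).
    CREATE TABLE flattened_cache (
        id       INT            NOT NULL,
        category VARCHAR(64)    NOT NULL,
        body     VARCHAR(15000) CHARACTER SET latin1,
        PRIMARY KEY (id)
    ) ENGINE=MEMORY;

A table that size would also bump into max_heap_table_size, which I'd have to raise accordingly.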
InnoDB on RAMDisk
This is meant to run in the cloud, and I could easily spin up a server with 16 GB of RAM, configure MySQL to write its data to tmpfs, and use full-featured MySQL. My concern with this is space. While I'm sure the MEMORY engine was written to avoid consuming all available memory and crashing the server, I doubt this setup would know when to stop. How much actual space will my 2,000 bytes of data consume once it's in database format? How can I monitor it?
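As a rough way of watching the footprint, I'm assuming something like the query below would do (schema and table names are hypothetical, and the figures come from InnoDB's statistics, so they're approximate rather than exact):

    -- Approximate data + index footprint of the cache table, in MB
    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024, 1) AS total_mb,
           ROUND(data_free / 1024 / 1024, 1)                    AS free_mb
    FROM information_schema.TABLES
    WHERE table_schema = 'cache_db'
      AND table_name   = 'flattened_cache';

On the OS side, df -h on the tmpfs mount would show how close the whole data directory is to filling the RAM disk.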
Bonus Questions
Indexes: I will in fact know in advance which columns need to be filtered and sorted by. I could set up an index before I do the inserts, but what kind of performance gain could I honestly expect on top of a RAM disk? How much extra overhead do indexes add?
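To make that concrete, this is the kind of index I mean (column names hypothetical, matching the sketch above):

    -- Index on the known filter/sort column, created before the bulk insert
    ALTER TABLE flattened_cache ADD INDEX idx_category (category);

An alternative worth benchmarking is loading all the data first and adding the index afterwards, since maintaining a secondary index across 400,000 inserts is itself overhead.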
Inserts: I'm assuming inserting multiple rows with one query is faster. But that one query, or series of large queries, is itself held in memory, and we're also writing to memory, so if I did that I'd momentarily need double the memory. So then we're talking about doing one or two or a hundred at a time, and having to wait for each batch to complete before processing more. InnoDB doesn't lock the whole table, but I worry about sending two queries too close together and confusing MySQL. Is this a valid concern? With the MEMORY engine I'd definitely have to wait for completion, due to table locks.
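For reference, by "multiple rows with one query" I mean batched VALUES lists like the sketch below (hypothetical data); I'd size each batch so the statement stays well under max_allowed_packet and doesn't double my memory use:

    -- One statement inserts a whole batch; a few hundred rows per batch
    -- keeps the statement buffer small relative to the table itself.
    INSERT INTO flattened_cache (id, category, body)
    VALUES
        (1, 'alpha', 'first flattened row...'),
        (2, 'beta',  'second flattened row...'),
        (3, 'gamma', 'third flattened row...');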
Temporary: Are there any benefits to temporary tables other than the fact that they're deleted when the DB connection closes?
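If it helps frame an answer, the usage I'm picturing is simply the same table created as TEMPORARY (names hypothetical again); my understanding is that, besides the auto-cleanup, such a table is only visible to the connection that created it, so concurrent requests can't collide on the name:

    -- Same schema, but scoped to this connection and dropped automatically
    CREATE TEMPORARY TABLE flattened_cache (
        id       INT         NOT NULL,
        category VARCHAR(64) NOT NULL,
        body     TEXT,
        PRIMARY KEY (id)
    ) ENGINE=InnoDB;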