0

Currently learning about database internals, and it seems like there's a lot of interesting tradeoffs when you are designing a data management system. Some of these tradeoffs seem to be dependent on what kind of behavior you think will occur once user's begin to use your database.

For example, you may design your database to have super fast reads for certain types of data but maybe it makes writes slow. So you're assuming that your user is mostly only going to care about the speed of reads.

One of the takeaways I've gotten so far is that, assumptions like this are dangerous because requirements change, can't predict user behavior, etc. So it got me thinking, are there any databases out there that use the databases internal statistics, do some cost analysis, and change their internal file structure/data structures/algorithms to something that better suits the users?

English is not my first language so I apologize if there's any bad grammar, happy to elaborate in comments if it doesn't make sense :)

  • "have super fast reads for certain types of data but maybe it makes writes slow" - if you mean adding indexes to existing data, that's exactly how RDBMS indexes work (though their impact on write performance is overstated, in my opinion). Many PaaS database services (like Azure SQL) will automatically create and delete indexes based on query performance - which is sort-of what you're describing. – Dai May 02 '20 at 01:21
  • But I'm not aware of any systems that will swap-out fundamentally different on-disk data structures for representing data _automatically_) - many RDBMS will let you use different on-disk structures (e.g. Columnar vs. Row-based table storage which can benefit OLAP but make OLTP more expensive) - or how MySQL will let you use different storage-engines for performance vs. resiliancy (MyISAM vs InnoDB, etc) - but these all require manual configuration and sometimes require actual downtime too. – Dai May 02 '20 at 01:24
  • Thanks for the insight! I think my example, may not have been the most appropriate. I think the idea of column vs row-based table storage is a much better representation of what I was trying to ask. Actually swapping between column and row-based table storage based on user query statistics, may be more trouble than its worth if it requires some downtime. – questions May 02 '20 at 01:43
  • SQL Server implements columnar-storage by duplicating row-based data in a columnar-format, which means there's no downtime, but does mean you end up with double the storage space requirements. – Dai May 02 '20 at 01:45

0 Answers0