Classic problem: I have many many client applications (native and web apps) connected to a system, all of which need to get updates in real-time as information changes. Most of the information changes rarely. When it does, the client systems need to know this very quickly. The backend system uses MariaDB in a Galera cluster with InnoDB as the storage engine.
I am aware that you can use the INFORMATION_SCHEMA for this sort of problem, but its UPDATE_TIME is unreliable for InnoDB tables - especially in a Galera cluster environment. Because of in-memory buffering, the UPDATE_TIME of the tables in question may be slightly out of date, and different servers in the cluster may report different values for it at any given moment. Is this a big deal? I doubt it. But I was looking for a solution that would give me accurate data and that was performant.
Each table in question has an 'updated_at' column with an index. I have tried querying each table individually (e.g., SELECT updated_at FROM my_table ORDER BY updated_at DESC LIMIT 1), but this is very expensive when there are a lot of tables to check. A stored procedure makes this somewhat more efficient, but it still feels inelegant.
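For context, the cheapest version of this polling I've come up with collapses all the per-table checks into one statement, so each table contributes only a single index dive on updated_at (MAX on an indexed column is very cheap in InnoDB). This is just a sketch - my_table_a and my_table_b stand in for the real tables:

```sql
-- One round trip instead of N separate queries.
-- Each inner SELECT reads only the tip of the updated_at index.
-- Table names are placeholders for the actual tables being watched.
SELECT 'my_table_a' AS table_name,
       (SELECT MAX(updated_at) FROM my_table_a) AS last_updated
UNION ALL
SELECT 'my_table_b',
       (SELECT MAX(updated_at) FROM my_table_b);
```

Even in this form, though, the query text has to enumerate every table, which is the part that feels fragile as tables are added.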
For example, would triggers make sense here? The tables are updated relatively rarely compared to the read activity on them. Could I set up insert/delete/update triggers on each table that update a corresponding last_updated time in ANOTHER table? Then just monitor that one table to decide whether data I care about has changed? Or should I use some sort of pub/sub mechanism to signal 'out of band' that data has changed, and have anything that cares subscribe to that?
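To make the trigger idea concrete, here is roughly what I have in mind (a sketch only - table_changes, my_table, and the trigger name are placeholders, and MariaDB requires a separate trigger per event, so each watched table would need AFTER INSERT, AFTER UPDATE, and AFTER DELETE variants):

```sql
-- Central tracking table: one row per watched table.
CREATE TABLE table_changes (
    table_name   VARCHAR(64) PRIMARY KEY,
    last_updated TIMESTAMP(6) NOT NULL
) ENGINE=InnoDB;

DELIMITER //
-- Touch the tracking row whenever my_table changes.
-- Analogous AFTER INSERT / AFTER DELETE triggers would be needed too.
CREATE TRIGGER my_table_after_update
AFTER UPDATE ON my_table
FOR EACH ROW
BEGIN
    INSERT INTO table_changes (table_name, last_updated)
    VALUES ('my_table', NOW(6))
    ON DUPLICATE KEY UPDATE last_updated = NOW(6);
END //
DELIMITER ;

-- Clients would then poll the single small table:
-- SELECT table_name, last_updated FROM table_changes
-- WHERE last_updated > @last_seen_timestamp;
```

My worry with this is that the per-table rows in table_changes become write hotspots: if several nodes update the same table concurrently, the trigger writes to the same tracking row could increase certification conflicts in Galera.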
This has to be a solved problem. Does anyone have a great solution to suggest?