This question is not about how to solve replication issues; the intention is to find bugs caused by slow replication. For performance reasons, we do not want all queries to be synchronous, only the queries we identify as critical reads.
We sometimes have bugs related to synchronization on our Galera cluster. For example, our web application redirects after writing data but shows an outdated state on the next page. We do not see these problems in the development environment. In production, under some server load, another node is sometimes not yet synchronized when we read data that was written a few milliseconds earlier.
To solve this, we use node pinning for critical reads, so that we read from the same node we just wrote to, and we are experimenting with SET SESSION wsrep_sync_wait=6;
for INSERT/UPDATE/DELETE, as described here, to reduce that behavior (and now also with the bit "1", as rick-james mentioned).
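To make the bitmask easier to reason about, here is a minimal Python sketch that decodes a wsrep_sync_wait value into the statement groups it covers. The bit meanings (1, 2, 4) are the ones I understand from the Galera documentation; newer versions may define further bits, and the names `SYNC_WAIT_BITS` and `covered` are just illustrative helpers, not part of any library.

```python
# Bit meanings of wsrep_sync_wait as commonly documented for Galera.
# Assumption: only bits 1, 2 and 4 are modelled here; other bits may exist
# depending on the server version.
SYNC_WAIT_BITS = {
    1: "READ (SELECT, BEGIN/START TRANSACTION)",
    2: "UPDATE/DELETE",
    4: "INSERT/REPLACE",
}


def covered(mask):
    """Return the statement groups for which the causality check is enabled."""
    return [name for bit, name in SYNC_WAIT_BITS.items() if mask & bit]


if __name__ == "__main__":
    for value in (6, 7):
        print(value, covered(value))
```

With this reading, a value of 6 enables the check for write statements only, while 7 additionally covers plain reads, which is what the stale page after the redirect depends on.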
How to test for bugs caused by slow replication?
Our idea is to simulate very slow synchronization so that we can test how our application behaves on critical reads. Is there a config option that makes a Galera cluster behave as if it were under heavy load? Galera has built-in flow control that slows the cluster down, but I could not find a reliable way to force a cluster into flow control. The solution does not have to rely on MySQL alone; a slow virtual volume combined with something like "innodb_flush_method" might be helpful, too.
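To illustrate the kind of test we have in mind, here is a small Python sketch. It does not touch Galera at all: `LaggingCluster` is a purely hypothetical in-process model in which the second node only receives writes when `apply_pending()` is called, and `save_then_redirect` stands in for our write-then-redirect request flow. What we are looking for is a way to get the same deterministic lag out of a real cluster.

```python
class LaggingCluster:
    """Toy two-node model (hypothetical): node 'a' sees writes at once,
    node 'b' only after apply_pending() is called."""

    def __init__(self):
        self.nodes = {"a": {}, "b": {}}
        self.pending = []  # writes not yet applied on node 'b'

    def write(self, key, value):
        # Writes always go to node 'a' and are queued for node 'b'.
        self.nodes["a"][key] = value
        self.pending.append((key, value))

    def apply_pending(self):
        # Simulates node 'b' catching up with replication.
        for key, value in self.pending:
            self.nodes["b"][key] = value
        self.pending.clear()

    def read(self, key, node):
        return self.nodes[node].get(key)


def save_then_redirect(cluster, pin_reads):
    """Hypothetical request flow: write, then render the next page."""
    cluster.write("profile", "new")
    read_node = "a" if pin_reads else "b"  # node pinning vs. any node
    return cluster.read("profile", read_node)


cluster = LaggingCluster()
cluster.write("profile", "old")
cluster.apply_pending()  # both nodes now hold "old"

stale = save_then_redirect(cluster, pin_reads=False)  # reads lagging node
fresh = save_then_redirect(cluster, pin_reads=True)   # reads the written node
print(stale, fresh)
```

The unpinned read returns the outdated value until the lagging node catches up, which is exactly the bug class we want to provoke on demand.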
(Updated to hopefully improve the question)