Is increasing `max_hint_window_in_ms` to days a bad idea?

Question

I'm considering raising max_hint_window_in_ms to something like 72 hours. Anyone see issues with this? Essentially, it would allow us much longer downtime of nodes over a weekend without having to do a full repair.

I assume you'll have to consider your write / live data ratio, as well as the absolute amount of live data, to decide whether accumulating hints over a longer period and then replaying them would be lighter than doing a full repair. Also consider that repairs are recommended for cluster health before gc_grace (default 10 days) anyway. - Steffen Winther Sørensen, Feb 20 '17 at 17:09

score 1 · Accepted Answer · answered Feb 20 '17 at 19:24

It depends on the version. From C* 3.0 or DSE 5.0 onward, when the hinted handoff storage was refactored, it's actually a very good idea to increase it. Before then (I'm assuming from your 2.1 tag that this applies to you), there are a lot of issues with accumulating too many hints, highlighted in this blog post. Unless you're on 3.0 or later, I would not recommend increasing it too much.
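For reference, here is a small sketch of the arithmetic behind a 72-hour window. It converts hours to the millisecond value that `max_hint_window_in_ms` expects and checks it against the 10-day gc_grace default mentioned in the comment above. The helper names are made up for illustration, and the gc_grace check is only a sanity check, not a substitute for testing on your own cluster.

```python
# Hypothetical helpers for illustration; they are not part of Cassandra or any driver.

DEFAULT_GC_GRACE_SECONDS = 864_000  # 10 days, the default gc_grace cited in the thread


def hint_window_ms(hours: float) -> int:
    """Convert a hint window in hours to the millisecond value used in cassandra.yaml."""
    return int(hours * 60 * 60 * 1000)


def fits_within_gc_grace(window_ms: int,
                         gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """Return True if the hint window is shorter than gc_grace."""
    return window_ms < gc_grace_seconds * 1000


window = hint_window_ms(72)
print(window)                        # 259200000
print(fits_within_gc_grace(window))  # True
```

So a 72-hour window would be written as `max_hint_window_in_ms: 259200000`, which still leaves it well below the default gc_grace.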

To highlight some pre 3.0 issues:

Hints are stored in a C* table that acts like a queue, which is a known antipattern: it builds up many tombstones and makes reads slow and expensive.
Hints are partitioned by node, so if one node is down for a long time its partition grows very large. This is handled better in the latest C*/DSE versions, but in 2.1 in particular it significantly impacts compactions and GCs.
Compactions run regularly and are required, but if nothing is being removed they just rewrite the same mutations over and over while the node is down, which is wasteful.
Individual mutations have to go through memtables and the full write path, rather than just being appended to disk.
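To get a feel for how large the hint backlog for one down node could become, a back-of-the-envelope estimate along the lines of the write / live data ratio raised in the comments might look like the sketch below. Every number in it is a made-up assumption (write rate to the down replica, average mutation size, live data per node), so substitute figures measured on your own cluster.

```python
# Rough, hypothetical estimate; all inputs are assumptions, not measurements.

def estimated_hint_backlog_bytes(writes_per_second: float,
                                 avg_mutation_bytes: float,
                                 downtime_hours: float) -> float:
    """Bytes of hints accumulated for one down node, ignoring any on-disk overhead."""
    return writes_per_second * avg_mutation_bytes * downtime_hours * 3600


def backlog_to_live_data_ratio(backlog_bytes: float, live_data_bytes: float) -> float:
    """Hint backlog as a fraction of the live data held by the node."""
    return backlog_bytes / live_data_bytes


# Assumed workload: 500 writes/s destined for the down replica, 1 KiB each, 72 h outage.
backlog = estimated_hint_backlog_bytes(500, 1024, 72)
print(backlog)  # 132710400000.0 bytes, roughly 133 GB

# Assumed 1 TB of live data on the node.
ratio = backlog_to_live_data_ratio(backlog, 1_000_000_000_000)
print(round(ratio, 3))  # 0.133
```

If the backlog approaches the size of the live data itself, replaying hints is unlikely to be much cheaper than a repair, and on pre-3.0 versions the issues listed above make it worse.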
