
Before starting, what I mean by large is GBs, and by medium-term storage, hours. We have a Flink application running on AWS Kinesis Data Analytics for Flink Applications (KDA), which uses the RocksDB state backend by default. Each KPU in KDA (roughly analogous to a task manager) has 50 GB of RocksDB storage. Incremental checkpointing is enabled.

Our app reads all customers' events from Kinesis and sends them to various destinations. When one destination becomes inaccessible, instead of stopping all processing, we want to store the events for that destination in Flink state and resend them later. To avoid running out of memory in Flink, we use a RocksDB-backed ListState to store a list of keys, where each key points to an entry in a RocksDB-backed MapState whose value is a list of events. This way we can serialize and deserialize a small subset of the pending events at a time, moving them from RocksDB into memory without hitting "Out Of Memory" errors. All of the above is keyed state, keyed by destination.
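Roughly, the state layout looks like this (a simplified sketch, not our actual code; the class name, batch ids, BATCH_SIZE, health check, and the plain-String event type are illustrative placeholders):

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Keyed by destination id; events are plain Strings here for brevity.
public class PendingEventsFunction extends KeyedProcessFunction<String, String, String> {

    private static final int BATCH_SIZE = 1000; // illustrative

    private transient ListState<Long> batchIds;             // ordered batch ids per destination
    private transient MapState<Long, List<String>> batches; // batch id -> pending events
    private transient long nextBatchId;                     // simplified: a real job would keep this in state too

    @Override
    public void open(Configuration parameters) {
        batchIds = getRuntimeContext().getListState(
                new ListStateDescriptor<>("batch-ids", Long.class));
        batches = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("batches",
                        TypeInformation.of(Long.class),
                        TypeInformation.of(new TypeHint<List<String>>() {})));
    }

    @Override
    public void processElement(String event, Context ctx, Collector<String> out) throws Exception {
        if (destinationIsHealthy(ctx.getCurrentKey())) {
            out.collect(event);   // deliver immediately
        } else {
            buffer(event);        // park it in RocksDB-backed state for a later resend
        }
    }

    // Appends the event to the current batch, rolling over to a new batch id when full.
    private void buffer(String event) throws Exception {
        long id = nextBatchId;
        List<String> batch = batches.get(id);
        if (batch == null) {
            batch = new ArrayList<>();
            batchIds.add(id);
        }
        batch.add(event);
        batches.put(id, batch);
        if (batch.size() >= BATCH_SIZE) {
            nextBatchId++;
        }
    }

    // Replays the pending batches one at a time, so only a small subset of
    // events is ever deserialized into memory.
    private void resend(Collector<String> out) throws Exception {
        for (Long id : batchIds.get()) {
            List<String> batch = batches.get(id);
            if (batch != null) {
                batch.forEach(out::collect);
                batches.remove(id);
            }
        }
        batchIds.clear();
    }

    private boolean destinationIsHealthy(String destination) {
        return true; // placeholder for a real health check
    }
}
```

How resend() actually gets triggered (a processing-time timer, a control stream, etc.) is left out of the sketch.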

My question is whether this is the right approach for this kind of problem. Will this kind of large state have a significant performance impact? Are there any maintenance pitfalls? I didn't find any similar usage or discussion of this pattern. Any suggestions are welcome.

Thanks!

George

1 Answer


I think you should be able to get something like what you propose to work. I wonder, though, if you really need the list of keys. MapState provides an iterator for iterating over the keys, and with RocksDB that iterator is guaranteed to iterate in order over the serialized keys. Perhaps that's all you need?
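For example, reusing the hypothetical `batches` MapState from the sketch in the question, the re-send path could drop the ListState of batch ids entirely and just iterate the map (a sketch, not tested code):

```java
import java.util.List;
import java.util.Map;

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.util.Collector;

class ResendWithoutKeyList {

    // Same layout as in the question's sketch: batch id -> pending events,
    // initialized in open() exactly as before.
    private transient MapState<Long, List<String>> batches;

    void resendAll(Collector<String> out) throws Exception {
        // With the RocksDB backend, entries() iterates in serialized-key order,
        // so non-negative, increasing batch ids come back in insertion order,
        // and the whole map is never deserialized into memory at once.
        for (Map.Entry<Long, List<String>> entry : batches.entries()) {
            entry.getValue().forEach(out::collect);
        }
        batches.clear(); // everything pending for this destination has been re-sent
    }
}
```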

You can expect, of course, to end up with large checkpoints, which can be something of an operational nuisance -- though at the scale of gigabytes, it shouldn't be too bad.

A possibly simpler alternative would be to deploy a separate job for each destination, let a job fail when its destination is unavailable, and recover it later.

David Anderson
  • Thanks for the comments. They give us more confidence. By the way, each of our customers has a different destination, and we have hundreds of thousands of customers, so one job per destination might be hard to maintain. Maybe one job that sends to all destinations but reads from a different data store instead of Kinesis? Kinesis seems ill-suited for this kind of usage, I think. – George Aug 10 '21 at 17:06
  • I thought that the different destinations were different failure domains, e.g., postgres vs kinesis vs s3. – David Anderson Aug 10 '21 at 17:51