I plan to use Flink on AWS Kinesis Analytics for Java Applications to perform stateful streaming aggregation.

I'd like to save checkpoints to a persistent store. What are my options?

  • Can I use S3 using FsStateBackend?

  • What about RocksDB? Is RocksDB offered out of the box by AWS Kinesis Analytics for Java Applications?

Thanks!

user544192

2 Answers


Regarding Flink checkpointing in Kinesis Data Analytics for Java Applications, this article shows how to configure checkpointing to an S3 bucket. S3 seems to be the persistent store that AWS recommends.

You can see that FsStateBackend supports S3 in Flink's official docs.
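
For a plain Flink job, pointing `FsStateBackend` at S3 might look roughly like the sketch below. The bucket path, the class name `S3CheckpointSketch`, the helper `configure`, and the 60-second interval are all placeholders for illustration, and the sketch assumes an S3 filesystem implementation (for example `flink-s3-fs-hadoop`) is available to the cluster. On Kinesis Data Analytics the service may manage the state backend itself, so treat this as a sketch for self-managed Flink rather than a confirmed recipe for that service.

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3CheckpointSketch {

    // Hypothetical helper: enables periodic checkpoints and stores them
    // under the given URI (e.g. an s3:// path).
    static void configure(StreamExecutionEnvironment env, String checkpointUri) {
        // Take a checkpoint every 60 seconds (placeholder interval).
        env.enableCheckpointing(60_000L);
        // Working state stays on the TaskManager heap; checkpoint
        // snapshots are written to the file system behind the URI.
        env.setStateBackend(new FsStateBackend(checkpointUri));
    }

    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder bucket and prefix; replace with your own.
        configure(env, "s3://my-bucket/flink/checkpoints");

        System.out.println("Checkpoint interval (ms): "
                + env.getCheckpointConfig().getCheckpointInterval());

        // Sources, the stateful aggregation, sinks, and env.execute(...)
        // would follow here in a real job.
    }
}
```

With this backend the checkpoint data lands in S3, while the in-flight state is held in memory, so very large state may call for a different backend.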

Michael
  • Thanks for the reply, Michael. Do you know if AWS also supports checkpointing to RocksDB? – user544192 Jan 03 '20 at 00:59
  • I'm sure you can checkpoint to RocksDB on AWS. However, I don't know if RocksDB is offered out of the box by AWS Kinesis Analytics for Java Applications. Why not S3? It seems well-documented and will probably cost less. – Michael Jan 21 '20 at 07:34
  • According to the latest [Amazon Kinesis Data Analytics Developer Guide](https://docs.aws.amazon.com/kinesisanalytics/latest/java/reference-flink-settings.title.html), RocksDB is the default state backend, and setting a different backend with `setStateBackend` has no effect. I assume RocksDB saves to S3 by default, but this is hidden from the user. – Nicus May 19 '20 at 12:27

Yes, you can use FsStateBackend to store Flink's checkpoints in an S3 bucket, as recommended in Flink's official documentation.

I have provided a detailed working answer here.

Keshav Lodhi