Flink for Stateless processing

Question

I am new to Flink, and our use case involves stateless computation: read an event, process it, and persist the result to a database. But the Flink documentation never discusses stateless processing. Is there an example repository or documentation that covers stateless examples?

Finally, which Flink model fits this use case: a streaming application or an event-driven application?

David Anderson · Answer 1 · 2020-12-09T13:50:55.730

There's a strong emphasis in the docs on stateful stream processing, because the community is proud of having created a highly performant, fault tolerant stream processor that provides exactly-once guarantees even when operating stateful pipelines at large scale. But you can certainly make good use of Flink without using state.

However, truly stateless applications are rare. Even in cases where the application doesn't do anything obviously stateful (such as windowing or pattern matching), state is required to deliver exactly-once, fault-tolerant semantics. Flink can ensure that each incoming event is persisted exactly once into the sink(s), but doing so requires that Flink's sources and sinks keep state, and that state must be checkpointed (and recovered after failures). This is all handled transparently, except that you need to enable and configure checkpointing, assuming you care about exactly-once guarantees.

The Flink docs include a tutorial on Data Pipelines and ETL that includes some examples and an exercise (https://github.com/apache/flink-training/tree/master/ride-cleansing) that are stateless.
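
To make "stateless" concrete: each event is handled on its own, with no memory of earlier events. Below is a minimal sketch in plain Java (deliberately without any Flink dependency) of the kind of per-event logic that, in a Flink job, would typically be handed to DataStream operators such as filter and map. The names Event, isValid, toRow, and process are hypothetical and exist only for illustration; they are not taken from the tutorial or the exercise.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StatelessSketch {
    // Hypothetical event type, for illustration only.
    static final class Event {
        final String id;
        final double amount;

        Event(String id, double amount) {
            this.id = id;
            this.amount = amount;
        }
    }

    // Stateless predicate: the decision depends only on the event itself.
    static boolean isValid(Event e) {
        return e.id != null && !e.id.isEmpty() && e.amount >= 0;
    }

    // Stateless transformation: the output is a function of one input event.
    // The amount is converted to whole cents for the row to be persisted.
    static String toRow(Event e) {
        return e.id + "," + Math.round(e.amount * 100);
    }

    // Applies the filter and the transformation to each event independently.
    static List<String> process(List<Event> events) {
        return events.stream()
                .filter(StatelessSketch::isValid)
                .map(StatelessSketch::toRow)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("a", 1.5),
                new Event("", 2.0),    // dropped: empty id
                new Event("b", -1.0)); // dropped: negative amount
        System.out.println(process(events));
    }
}
```

Because neither method reads or writes anything outside its argument, the same logic can run in parallel on any number of events without coordination. Any state Flink needs for exactly-once delivery would then live in the sources and sinks, not in this code.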

Flink has three primary APIs:

DataStream API: this low-level API is very capable, but the other APIs listed here have strong advantages for some use cases. The tutorials in the docs make for a good starting point. See also https://training.ververica.com/.
Flink SQL and the Table API: this is especially well suited for ETL and analytics workloads. https://github.com/ververica/sql-training is a good starting point.
Stateful Functions API: this API offers a different set of abstractions, and a cloud-native, language-agnostic runtime that supports a variety of SDKs. This is a good choice for event-driven applications. https://flink.apache.org/stateful-functions.html and https://github.com/ververica/flink-statefun-workshop are good starting points.

Thank you, @David Anderson. Where can I find details on Flink streaming vs. Flink event-driven applications? Our use case cannot be satisfied with SQL; it involves heavy computation. Any suggestions would be really helpful. — Forece85, Dec 09 '20 at 10:41
Without knowing more about the requirements of your use case, I don't know what to suggest. — David Anderson, Dec 09 '20 at 13:26
For what it's worth, I don't see why "heavy computation" rules out Flink SQL. Pipelines built with SQL can be very performant. — David Anderson, Dec 09 '20 at 13:54
The Data Pipelines and ETL link above no longer works; use this one: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/learn-flink/etl/ — Muhammad Faizan Fareed, Jul 24 '21 at 04:52
