I am currently wondering how to handle application errors in Apache Flink streaming applications. In general, I see two cases:
- Transient errors, where you want the input data to be replayed and processing might succeed on a second try. An example would be a dependency on an external service that is temporarily unavailable.
- Permanent errors, where repeated processing will still fail; for example invalid input data.
For the first case, it looks like the common solution is to just throw an exception. Or is there a better way, e.g. a special kind of exception for more efficient handling, such as FailedException from Apache Storm Trident (see Error handling in Storm Trident topologies)?
For permanent errors, I couldn't find any information online. A map() operation, for example, always has to return something, so one cannot just silently drop messages as you would in Trident.
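To make the problem concrete, here is a minimal sketch in plain Java of the two operator shapes I have in mind. The interfaces OneToOne and EmitStyle, the class name, and the integer-parsing example are hypothetical stand-ins I made up for illustration; they are not the actual Flink or Trident interfaces. With the one-to-one shape, invalid input can only throw or return some value. With the emit-style shape, the function can simply emit nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ErrorHandlingShapes {

    // Hypothetical stand-in for a map()-like operator: exactly one output per input.
    interface OneToOne<I, O> {
        O apply(I value);
    }

    // Hypothetical stand-in for an emit-style operator: zero or more outputs per input.
    interface EmitStyle<I, O> {
        void apply(I value, Consumer<O> out);
    }

    // One-to-one: invalid input has nowhere to go, so the only option is to throw.
    static final OneToOne<String, Integer> PARSE_STRICT = Integer::parseInt;

    // Emit-style: a permanent error (unparseable input) is dropped by not emitting.
    static final EmitStyle<String, Integer> PARSE_LENIENT = (value, out) -> {
        try {
            out.accept(Integer.parseInt(value));
        } catch (NumberFormatException e) {
            // Invalid input will never parse, so replaying it would not help; skip it.
        }
    };

    // Runs the lenient parser over all inputs and collects whatever was emitted.
    static List<Integer> runLenient(List<String> input) {
        List<Integer> result = new ArrayList<>();
        for (String s : input) {
            PARSE_LENIENT.apply(s, result::add);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runLenient(List.of("1", "oops", "3")));
        try {
            PARSE_STRICT.apply("oops");
        } catch (NumberFormatException e) {
            System.out.println("strict parser threw: " + e.getMessage());
        }
    }
}
```

What I am looking for is the equivalent of the second shape, or whatever the recommended alternative is.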
What are the available APIs or best practices? Thanks for your help.