Flink supports two different approaches to rescaling: active and reactive.
Reactive mode is new in Flink 1.13 (released just this week), and works as you expected: add (or remove) a task manager, and your application will adjust to the new parallelism. You can read about elastic scaling and reactive mode in the docs.
Reactive mode is currently a work in progress, but might need your needs.
In broad strokes, for active mode rescaling you need to:
- Do a stop with savepoint to bring down your current job while taking a snapshot of its state.
- Relaunch with the new parallelism, using the savepoint as the starting point.
The exact details depend on how your cluster is deployed.
For a step-by-step tutorial, see Upgrading & Rescaling a Job in the Flink Operations Playground.
The above applies to rescaling statefun embedded functions. Being stateless, remote functions can be rescaled more straightforwardly.