
I have a Java class that submits SQL files to a Flink cluster.

I have

StreamExecutionEnvironment streamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();

streamExecutionEnvironment.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE);
streamExecutionEnvironment.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
streamExecutionEnvironment.getCheckpointConfig().setExternalizedCheckpointCleanup(
        CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

streamExecutionEnvironment.enableCheckpointing(5000, CheckpointingMode.AT_LEAST_ONCE);
streamExecutionEnvironment.getCheckpointConfig().setCheckpointStorage(customParams.get("checkpoint_path"));

Configuration config = new Configuration();
config.set(ExecutionCheckpointingOptions.ENABLE_CHECKPOINTS_AFTER_TASKS_FINISH, true);
config.set(PipelineOptions.NAME, customParams.get("pipeline_name"));

if (restartFromSavepointPath != null) {
    config.set(SAVEPOINT_PATH, restartFromSavepointPath);
}

streamExecutionEnvironment.setStateBackend(new EmbeddedRocksDBStateBackend(true));
streamExecutionEnvironment.configure(config);
...
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(streamExecutionEnvironment);

tableEnv.executeSql("create table ....");

// this is end of the main class

For getting restartFromSavepointPath I have some code that finds the latest checkpoint location; I can see the value as file:///tmp/flink-checkpoint-directory-domain/a98c68e3139041bc32e6a931e1f701e1/chk-24/_metadata

When I package the above code as a fat JAR and run it, the job does NOT start from the above checkpoint. The command to start it is `flink run -c com.some.Deployer /some/local/location/some.jar`. How do I get this to start from the savepoint, given that `execution.savepoint.path` is set via `config.set(SAVEPOINT_PATH, restartFromSavepointPath);`?

But if I use the `-s` option, `flink run -c com.some.Deployer -s file:///tmp/flink-checkpoint-directory-domain/a98c68e3139041bc32e6a931e1f701e1/chk-24/_metadata /some/local/location/some.jar` does start the job from the savepoint.
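One thing to check (a sketch, not a confirmed fix): `execution.savepoint.path` is read by the client side when the job is submitted, so it may need to be in the `Configuration` that the environment is created with, rather than applied afterwards via `configure()`. Assuming Flink 1.12+, the environment factory accepts a `Configuration` directly:

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Build the Configuration FIRST, including the savepoint path, then hand it
// to the environment factory so the submitting client can see it.
Configuration config = new Configuration();
// Key is "execution.savepoint.path" (the same key SAVEPOINT_PATH refers to).
config.setString("execution.savepoint.path",
        "file:///tmp/flink-checkpoint-directory-domain/a98c68e3139041bc32e6a931e1f701e1/chk-24/_metadata");

StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment(config);
```

Note that when you launch with `flink run`, the CLI builds the execution context itself, which is why `-s` works reliably; whether a programmatically set path is honored can depend on the deployment mode.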


3 Answers


You can set `execution.savepoint.path` in flink-conf.yaml to the latest path and it will be picked up. You can also set `execution.savepoint.ignore-unclaimed-state` to `false`, so that the job will not start if it cannot be restored from the specified savepoint; if it is set to `true`, Flink will try to restore from the savepoint and, if that fails, start normally.

If you are using the Flink Kubernetes Operator, you can set `initialSavepointPath` in the FlinkDeployment YAML under the job spec, as below:

job:
  jarURI: {{ .Values.jarName }}
  parallelism: {{ .Values.parallelism }}
  entryClass: {{ .Values.entryClass }}
  initialSavepointPath: {{ .Values.restorePath }}

You can replace `{{ .Values.restorePath }}` with your savepoint location.

May I know how you are getting the latest savepoint path programmatically?


I ended up using `config.set(SAVEPOINT_PATH, checkpointPath);`, where `config` is a `Configuration` and `checkpointPath` is the path to the latest `_metadata` file in AWS S3.
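Regarding the comment above about finding the latest savepoint path programmatically: for a local filesystem checkpoint directory (the answer above uses S3, where you would list objects with the AWS SDK instead), a minimal sketch is to pick the highest-numbered `chk-N` subdirectory that actually contains a `_metadata` file. `LatestCheckpointFinder` and `latestMetadata` are hypothetical names for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

public class LatestCheckpointFinder {

    // Return the _metadata path of the highest-numbered chk-N directory
    // under the job's checkpoint directory, skipping incomplete checkpoints
    // (those without a _metadata file).
    static Optional<Path> latestMetadata(Path jobCheckpointDir) throws IOException {
        try (Stream<Path> children = Files.list(jobCheckpointDir)) {
            return children
                    .filter(Files::isDirectory)
                    .filter(p -> p.getFileName().toString().matches("chk-\\d+"))
                    .filter(p -> Files.exists(p.resolve("_metadata")))
                    .max(Comparator.comparingLong(
                            p -> Long.parseLong(p.getFileName().toString().substring(4))))
                    .map(p -> p.resolve("_metadata"));
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo with a temporary layout mimicking chk-23 and chk-24.
        Path dir = Files.createTempDirectory("job-id");
        for (int n : new int[] {23, 24}) {
            Path chk = Files.createDirectories(dir.resolve("chk-" + n));
            Files.createFile(chk.resolve("_metadata"));
        }
        latestMetadata(dir).ifPresent(
                p -> System.out.println("latest: " + p.getParent().getFileName()));
    }
}
```

The `_metadata` existence check matters because the highest-numbered `chk-N` directory may belong to an in-progress checkpoint.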


If you want to restore from a savepoint, you must take a savepoint when you stop or cancel the job; then you can run the job with that savepoint:

bin/flink stop --type [native/canonical] --savepointPath [:targetDirectory] :jobId
    I understand that you can use `--savepointPath` with the flink command, but I want a way to set this within my Java program, using `config.set(SAVEPOINT_PATH, restartFromSavepointPath);`, where the `SAVEPOINT_PATH` key is `execution.savepoint.path`. Is this possible? My job runs multiple `insert into ...` statements; each of these is a separate job in Flink, and each needs to start from a different savepoint. – user3822232 Mar 09 '23 at 15:43