    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // ... job definition ...
    JobExecutionResult jobExecutionResult = env.execute(XXXOffline.class.getName());
    int records = jobExecutionResult.<Integer>getAccumulatorResult("counter");
    LOGGER.info("total records: {}", records);

However, the log line was never written to the log file (other log output works fine). As I understand it, env.execute() is a blocking call, so getAccumulatorResult() only runs after all subtasks have finished. I don't understand why the last log statement produces no output.

Qoobee

1 Answer


From the docs, you can see that

Accumulators are simple constructs with an add operation and a final accumulated result, which is available after the job ended.

So, as you have figured out, there is no way to access the accumulator before the job has terminated (i.e., before env#execute returns). Accumulators can be used to orchestrate smaller (bounded) jobs; I often use them in integration tests to formulate assertions.
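For reference, here is a minimal, self-contained sketch of that pattern for a bounded DataSet job (the class name, accumulator name, and dummy input are illustrative placeholders, not your actual code):

    import org.apache.flink.api.common.JobExecutionResult;
    import org.apache.flink.api.common.accumulators.IntCounter;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.io.DiscardingOutputFormat;
    import org.apache.flink.configuration.Configuration;

    public class AccumulatorExample {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            env.fromElements("a", "b", "c")
                .map(new RichMapFunction<String, String>() {
                    // One accumulator instance per subtask; the JobManager
                    // merges them into a single result at job end.
                    private final IntCounter counter = new IntCounter();

                    @Override
                    public void open(Configuration parameters) {
                        getRuntimeContext().addAccumulator("counter", counter);
                    }

                    @Override
                    public String map(String value) {
                        counter.add(1);
                        return value;
                    }
                })
                .output(new DiscardingOutputFormat<>());

            // execute() blocks until the job has finished; only then is the
            // merged accumulator value available on the JobExecutionResult.
            JobExecutionResult result = env.execute("accumulator-example");
            int records = result.<Integer>getAccumulatorResult("counter");
            System.out.println("total records: " + records);
        }
    }

Note that a DataSet job needs a sink (here a DiscardingOutputFormat) before execute() can be called.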

For unbounded jobs, they have no clear benefit. You want to use metrics instead.
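For completeness, the streaming counterpart would register a counter metric on the runtime context, which is reported continuously while the job runs; a rough sketch (the metric name is illustrative):

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    public class RecordCountingMapper extends RichMapFunction<String, String> {

        private transient Counter recordsProcessed;

        @Override
        public void open(Configuration parameters) {
            // Registered metrics are exposed by the configured metric reporters
            // while the job is running, so they also work for unbounded jobs.
            recordsProcessed = getRuntimeContext()
                    .getMetricGroup()
                    .counter("recordsProcessed");
        }

        @Override
        public String map(String value) {
            recordsProcessed.inc();
            return value;
        }
    }

The value then shows up in the web UI and in whatever metric reporters are configured, instead of being collected once at job end.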

Arvid Heise
  • My program is a DataSet batch program. After env.execute() returns, I should be able to get the accumulator result and log it, but in practice that doesn't happen. – Qoobee Apr 28 '20 at 11:52
  • 1
  • There are several examples in terms of [tests](https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/accumulators/AccumulatorITCase.java) in Flink, which you can use to double-check your solution. But the log line should still appear. I rather suspect that you misconfigured your logger. You could also use a debugger to check what's really going on. – Arvid Heise Apr 29 '20 at 13:29
  • Thanks. It was probably a problem with the Flink cluster; the program works as expected when started from the local IDE. – Qoobee May 07 '20 at 02:57