  • Currently, the Flink application is configured and implemented to create Avro files on every checkpoint.

  • Is it possible to force the Flink application to create an Avro file on demand, instead of at a configurable time interval?

  • Is there a REST API, or any other Java implementation or configuration, to force a checkpoint?

Environment

  • Flink version - 1.15.4
  • Jdk 8
Hareesh

1 Answer


Is it possible to force the Flink application to create an Avro file on demand, instead of at a configurable time interval?

Assuming you are using StreamingFileSink.forBulkFormat to write Avro, you can implement a custom CheckpointRollingPolicy that rolls the part file on processing time or on a specific event, in addition to rolling on every checkpoint:

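import org.apache.flink.streaming.api.functions.sink.filesystem.PartFileInfo;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.CheckpointRollingPolicy;

// CheckpointRollingPolicy already rolls on every checkpoint; the two methods
// below decide whether to roll additionally between checkpoints.
public final class CustomCheckpointRollingPolicy<IN, BucketID>
        extends CheckpointRollingPolicy<IN, BucketID> {

    private static final long serialVersionUID = 1L;

    // Return true here to roll the part file when a specific element arrives.
    @Override
    public boolean shouldRollOnEvent(PartFileInfo<BucketID> partFileState, IN element) {
        return false;
    }

    // Return true here to roll the part file based on processing time.
    @Override
    public boolean shouldRollOnProcessingTime(
            PartFileInfo<BucketID> partFileState, long currentTime) {
        return false;
    }
}
...
StreamingFileSink
      .forBulkFormat(outputBasePath, AvroWriters.forGenericRecord(schema))
      .withRollingPolicy(new CustomCheckpointRollingPolicy<>())
      .build();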
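
As written, both methods return false, so the policy behaves like the default roll-on-checkpoint policy. A minimal sketch of the decision logic that the two methods could delegate to is shown below, in plain Java so that it runs without Flink on the classpath. The class name RollDecision, the flush-marker predicate and the maxPartAgeMs threshold are hypothetical and are not part of the Flink API. In a real policy the predicate would also have to be serializable, and, as far as I know, a part file rolled this way only becomes a finished file once the next checkpoint completes.

```java
import java.util.function.Predicate;

// Hypothetical helper (not part of Flink) holding the roll decisions that a
// CheckpointRollingPolicy subclass could delegate to.
final class RollDecision<IN> {

    private final Predicate<IN> isFlushMarker; // assumption: some elements act as "flush now" markers
    private final long maxPartAgeMs;           // assumption: maximum age of an open part file

    RollDecision(Predicate<IN> isFlushMarker, long maxPartAgeMs) {
        this.isFlushMarker = isFlushMarker;
        this.maxPartAgeMs = maxPartAgeMs;
    }

    // Intended for shouldRollOnEvent(partFileState, element):
    // roll as soon as a marker element arrives.
    boolean rollOnEvent(IN element) {
        return isFlushMarker.test(element);
    }

    // Intended for shouldRollOnProcessingTime(partFileState, currentTime),
    // passing partFileState.getCreationTime() as creationTime:
    // roll once the part file has been open for maxPartAgeMs or longer.
    boolean rollOnProcessingTime(long creationTime, long currentTime) {
        return currentTime - creationTime >= maxPartAgeMs;
    }
}
```

With this in place, shouldRollOnEvent would return decision.rollOnEvent(element), and an upstream operator or source would emit the marker element whenever a file is wanted on demand.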

Is there a REST API, or any other Java implementation or configuration, to force a checkpoint?

Yes, you can trigger a savepoint without cancelling the job, which will trigger a checkpoint. The corresponding REST API endpoint is /jobs/:jobid/savepoints. See the #jobs-jobid-savepoints section of the REST API documentation for details.

Update: It is possible to trigger a checkpoint through a dedicated POST /jobs/:jobid/checkpoints endpoint. See the #jobs-jobid-checkpoints-1 section of the REST API documentation for details.
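
For the savepoint route, the trigger call only returns a request id. The outcome has to be polled from /jobs/:jobid/savepoints/:triggerid, whose response carries a status id (IN_PROGRESS or COMPLETED) and, on success, the savepoint location. Below is a minimal Java 8 sketch of both calls using only the JDK. The class name, the REST address http://localhost:8081 and the target directory file:///tmp/savepoints are assumptions for illustration. The sketch only shows how to trigger and check a savepoint; it says nothing about whether the sink rolls a new part file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper class, not part of Flink.
final class SavepointRestSketch {

    // Trigger endpoint: POST /jobs/:jobid/savepoints
    static String triggerUrl(String baseUrl, String jobId) {
        return baseUrl + "/jobs/" + jobId + "/savepoints";
    }

    // Status endpoint: GET /jobs/:jobid/savepoints/:triggerid
    static String statusUrl(String baseUrl, String jobId, String triggerId) {
        return triggerUrl(baseUrl, jobId) + "/" + triggerId;
    }

    // JSON request body; assumes targetDirectory needs no JSON escaping.
    static String triggerBody(String targetDirectory, boolean cancelJob) {
        return "{\"cancel-job\": " + cancelJob
                + ", \"target-directory\": \"" + targetDirectory + "\"}";
    }

    // Sends the request and returns the raw response body.
    static String send(String method, String url, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod(method);
        if (body != null) {
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
        }
        try (InputStream in = conn.getInputStream();
                Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    // Usage: <jobId> triggers a savepoint; <jobId> <triggerId> polls its status.
    public static void main(String[] args) throws IOException {
        String base = "http://localhost:8081"; // assumed JobManager REST address
        if (args.length == 1) {
            // The response contains the "request-id" to use as the trigger id.
            System.out.println(send("POST", triggerUrl(base, args[0]),
                    triggerBody("file:///tmp/savepoints", false)));
        } else if (args.length == 2) {
            // Repeat until the response reports the status id COMPLETED.
            System.out.println(send("GET", statusUrl(base, args[0], args[1]), null));
        }
    }
}
```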

Mikalai Lushchytski
  • I have tried the /jobs/:jobid/savepoints REST API, passing `{ "cancel-job" : false, "target-directory" : "file://temp" }`, and it successfully returned the response `{"request-id":"112"}`. However, no Avro file was generated in the target directory. How can we know whether the Avro file was generated at the time of the trigger? How can we know that the checkpoint was triggered successfully and that the Avro file was generated? Please suggest how to fix this issue. – Hareesh Jun 15 '23 at 15:49
  • Does the /jobs/:jobid/savepoints REST API generate only a _metadata file? The target directory contained only the _metadata file, not the Avro file. My requirement is: is it possible to force the Avro file to be created on demand? – Hareesh Jun 16 '23 at 11:25
  • A POST to /jobs/:jobid/checkpoints will trigger a checkpoint - see https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints-1. Sorry, I was under the impression that a `savepoint` would also roll out a new file. – Mikalai Lushchytski Jun 16 '23 at 12:01
  • POST /jobs/:jobid/checkpoints is not available in Flink 1.15.4; only the GET method exists. Can you suggest any other way to force the Avro file to be generated on demand? – Hareesh Jun 16 '23 at 17:20