I have a table named Metadata.

I want the contents of this table to be available in my Flink application, so I want to read every entry in the table and store it in MapState<Metadata::Id, Metadata>.

If my application restarts, I do not want to read from the table again. Instead, I want to restore the entries from MapState<Metadata::Id, Metadata> and use them.

Is there a way to achieve this?

Logic

1 Answer

The YouTube video and GitHub repo I linked to in this answer cover a number of similar scenarios, but the best way to bootstrap Flink state is to preload the data into a savepoint using the State Processor API.

Keep in mind that Flink's MapState is a kind of key-partitioned state: each key of the keyed stream gets its own separate map. So if you use MapState<Metadata::Id, Metadata>, that is effectively a Map<KEY, MapState<Metadata::Id, Metadata>> that is sharded across the cluster by KEY, where KEY is whatever your keyBy produces.
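To make the scoping concrete, here is a toy model in plain Java. It is not Flink code: the class KeyedMapStateModel and its setCurrentKey method are hypothetical names invented for illustration, and Metadata is reduced to a String with a Long id. It only mimics the fact that reads and writes to MapState see the map belonging to the current stream key.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of keyed MapState (hypothetical, not a Flink class). */
class KeyedMapStateModel {
    // Outer map: stream key (the keyBy result) -> that key's own map contents.
    private final Map<String, Map<Long, String>> stateByKey = new HashMap<>();
    private String currentKey;

    /** In Flink the runtime sets the current key before each element is processed. */
    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    /** Stores an entry in the map that belongs to the current key only. */
    void put(Long metadataId, String metadata) {
        stateByKey.computeIfAbsent(currentKey, k -> new HashMap<>())
                  .put(metadataId, metadata);
    }

    /** Looks up an entry in the current key's map; other keys' maps are invisible. */
    String get(Long metadataId) {
        Map<Long, String> mapForKey = stateByKey.get(currentKey);
        return mapForKey == null ? null : mapForKey.get(metadataId);
    }

    public static void main(String[] args) {
        KeyedMapStateModel state = new KeyedMapStateModel();

        state.setCurrentKey("A");
        state.put(42L, "meta-42");
        System.out.println(state.get(42L)); // prints "meta-42"

        state.setCurrentKey("B");
        System.out.println(state.get(42L)); // prints "null": key B has its own empty map
    }
}
```

The practical consequence is that an entry written while processing one key cannot be looked up while processing another, so the choice of keyBy decides which Metadata entries each record can see.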

Here's an example showing how to create a savepoint containing a ValueState<Integer>:

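import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;

public class Bootstrap {
    public static void main(String[] args) throws Exception {
        // The bootstrap job is a batch job that writes a savepoint.
        ExecutionEnvironment bEnv =
                ExecutionEnvironment.getExecutionEnvironment();

        // Key each input element by its string form and feed it to the bootstrap function.
        BootstrapTransformation<Integer> transform =
                OperatorTransformation.bootstrapWith(bEnv.fromElements(1, 2, 3))
                        .keyBy(String::valueOf)
                        .transform(new SimplestTransform());

        // 256 is the max parallelism of the new savepoint. The uid must match
        // the uid of the stateful operator in the job that restores from it.
        Savepoint
                .create(new FsStateBackend("file:///tmp/checkpoints"), 256)
                .withOperator("my-operator-uid", transform)
                .write("file:///tmp/savepoints/");

        bEnv.execute();
    }

    public static class SimplestTransform
            extends KeyedStateBootstrapFunction<String, Integer> {
        ValueState<Integer> state;

        @Override
        public void open(Configuration parameters) {
            // Name and type must match the descriptor used by the restoring job.
            ValueStateDescriptor<Integer> descriptor =
                    new ValueStateDescriptor<>("total", Types.INT);
            state = getRuntimeContext().getState(descriptor);
        }

        @Override
        public void processElement(Integer value, Context ctx) throws Exception {
            // Called with the current key set to String.valueOf(value).
            state.update(value);
        }
    }
}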

This creates a sharded key/value map containing {"1": 1, "2": 2, "3": 3}.

David Anderson