
I am using Flink 1.4.1 to process transactional events and HDFS to store the checkpoint information for fault tolerance.

A job was created to aggregate information per client, day of week, and hour of day, building a profile as shown in the code below.

import java.sql.Timestamp
import java.time.{LocalDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

val stream = env.addSource(consumer)
val result = stream
  .map(openTransaction => {
    // transactionDate may arrive as an ISO-8601 string or as epoch millis
    val transactionDate = openTransaction.get("transactionDate")
    val date = if (transactionDate.isTextual)
      LocalDateTime.parse(transactionDate.asText, DateTimeFormatter.ISO_DATE_TIME).toInstant(ZoneOffset.UTC).toEpochMilli
    else
      transactionDate.asLong
    (openTransaction.get("clientId").asLong, openTransaction.get("amount").asDouble, new Timestamp(date))
  })
  .keyBy(0)
  .window(SlidingEventTimeWindows.of(Time.days(28), Time.hours(1)))
  .sum(1)

In the code above, the stream has three fields: "transactionDate", "clientId" and "amount". We key the stream by clientId and apply a sliding window that sums the amount. There are around 100,000 unique active clientIds in our database.

After running for some time, the total RAM used by the job stabilizes at 36 GB, but the checkpoint stored in HDFS uses only 3 GB. Is there a way to reduce the job's RAM usage, maybe by configuring Flink's replication factor or by using RocksDB?

1 Answer


Using RocksDB is absolutely something you should consider for this state size, and, depending on your usage patterns, it can produce much smaller checkpoints, since it checkpoints incrementally by copying only the SSTs that are new or updated.

Some things to keep in mind:

  • Each stateful operator parallel sub-task will have its own RocksDB instance.
  • If you do switch to RocksDB and it runs slower than you need it to, make sure the serialization you are using is as efficient as possible.
  • Flink provides some PredefinedOptions based on your backing file system; make sure you choose them appropriately.
  • If the predefined options don't work for you, you can override the OptionsFactory for the RocksDB backend and fine-tune the individual RocksDB options (see the sketch after this list).
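
A minimal sketch of wiring the last two points together on Flink 1.4, assuming a hypothetical HDFS checkpoint path and illustrative option values (not tuned recommendations):

import org.apache.flink.contrib.streaming.state.{OptionsFactory, PredefinedOptions, RocksDBStateBackend}
import org.rocksdb.{ColumnFamilyOptions, DBOptions}

// the second constructor argument enables incremental checkpoints
val backend = new RocksDBStateBackend("hdfs:///flink/checkpoints", true)

// choose the predefined option set that matches the backing disks
backend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM)

// fine-tune individual RocksDB options via the OptionsFactory
backend.setOptions(new OptionsFactory {
  override def createDBOptions(current: DBOptions): DBOptions =
    current.setMaxBackgroundCompactions(4) // illustrative value
  override def createColumnOptions(current: ColumnFamilyOptions): ColumnFamilyOptions =
    current.setWriteBufferSize(64 * 1024 * 1024) // 64 MB memtables, illustrative
})

env.setStateBackend(backend)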

Another thing to note about memory usage in Flink with keyed time windows is that the "timers" can use a significant amount of memory once you get into the hundreds of thousands or millions of them. Flink timers are heap-based (as of this writing) and are checkpointed synchronously, independently of your state backend.
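
As a back-of-the-envelope check against the numbers above: a 28-day window sliding every hour assigns each key to 28 × 24 = 672 overlapping windows, so roughly 100,000 active clientIds put you on the order of 67 million window states and timers, which helps explain why heap usage can dwarf the 3 GB of checkpointed state.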

Joshua DeWald
  • We already tried using RocksDB with the lines below, but it didn't affect memory usage. We expected it would store state on disk when state exceeds memory, but it seems to affect only checkpoint storage. `val env = StreamExecutionEnvironment.getExecutionEnvironment()` `env.setStateBackend(new RocksDBStateBackend(filebackend, true))` Is there a configuration we are missing? – Gabriel Pelielo May 03 '18 at 20:05
  • Added a note about timers that might be relevant to you. – Joshua DeWald May 04 '18 at 16:00