We are deriving the status of accounts from the activity on them. Based on user activity, we calculate and store an expiryOn date for each account (the tentative future date on which the account expires).

We also have a manual date change event that supplies a date; based on that date, the status of the account is emitted as Expired.

I would like to know the best way to achieve this. Since the date change event arrives later than the calculation of the expiryOn date, can broadcast state be a solution? If so, please suggest how. Or is there a better approach, such as the Table API?

Kranthi

1 Answer

Broadcast state is suitable in cases (like this one) where you need to either share information or invoke actions that aren't keyed, and so cannot be sent to one relevant instance.

If you need to store the broadcast state, keep in mind that each instance will store a copy of the broadcast state on the heap, and include that copy in its checkpoints.

If you are using `context.applyToKeyedState()`, be careful to make only deterministic changes to the keyed state. Otherwise, if a failure and recovery happens at a point where some instances of the broadcast operator have applied the changes to keyed state and others have not, you could end up with inconsistencies.
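
The read-only case can be illustrated with a minimal plain-Java model. This is not Flink code: the class `BroadcastExpirySketch`, its `Instance` type, and the `route` and `broadcast` helpers are hypothetical stand-ins for parallel operator instances, keyed routing, and broadcast delivery. It also assumes that "expired" means the stored expiryOn date is on or before the broadcast date, which the question does not spell out. The point is only that a broadcast date reaches every instance, and each instance reads the keyed state it owns without modifying it, so delivering the same date again gives the same result.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BroadcastExpirySketch {

    /** Hypothetical stand-in for one parallel instance and the keyed state it owns. */
    public static final class Instance {
        // Models keyed state: one expiryOn date per account id held by this instance.
        private final Map<String, LocalDate> expiryOnByAccount = new TreeMap<>();

        /** Keyed side: an activity event updates expiryOn for exactly one account. */
        public void onActivity(String accountId, LocalDate expiryOn) {
            expiryOnByAccount.put(accountId, expiryOn);
        }

        /** Broadcast side: visit every account held here and only READ its state. */
        public List<String> onBroadcastDate(LocalDate currentDate) {
            List<String> expired = new ArrayList<>();
            for (Map.Entry<String, LocalDate> entry : expiryOnByAccount.entrySet()) {
                // Assumption: an account is expired when expiryOn is on or before the date.
                if (!entry.getValue().isAfter(currentDate)) {
                    expired.add(entry.getKey());
                }
            }
            return expired;
        }
    }

    /** Keyed events go to exactly one instance, chosen here by a simple hash. */
    public static Instance route(List<Instance> instances, String accountId) {
        return instances.get(Math.floorMod(accountId.hashCode(), instances.size()));
    }

    /** A broadcast element is delivered to every instance. */
    public static List<String> broadcast(List<Instance> instances, LocalDate currentDate) {
        List<String> expired = new ArrayList<>();
        for (Instance instance : instances) {
            expired.addAll(instance.onBroadcastDate(currentDate));
        }
        Collections.sort(expired); // sorted so the result does not depend on routing
        return expired;
    }

    public static void main(String[] args) {
        List<Instance> instances = List.of(new Instance(), new Instance());
        route(instances, "acct-1").onActivity("acct-1", LocalDate.of(2021, 7, 20));
        route(instances, "acct-2").onActivity("acct-2", LocalDate.of(2021, 8, 1));
        route(instances, "acct-3").onActivity("acct-3", LocalDate.of(2021, 7, 21));
        System.out.println(broadcast(instances, LocalDate.of(2021, 7, 21)));
    }
}
```

In an actual job, the body of `onBroadcastDate` would presumably sit in `processBroadcastElement` of a `KeyedBroadcastProcessFunction`, with the loop over accounts replaced by `applyToKeyedState`. The model says nothing about Flink's checkpointing or network behaviour.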

David Anderson
  • Hello David, thanks for the response. However, the stream that contains the date (the current date) is not specific to any account, so there is no account id to key on. – Kranthi Jul 20 '21 at 03:56
  • Then broadcast is the way to go. – David Anderson Jul 20 '21 at 11:17
  • Ok. I am using the `context.applyToKeyedState()` method in the `processBroadcastElement` part of the `KeyedBroadcastProcessFunction` and got it working. Since it broadcasts the date to the state of every account, what is the implication (overhead, if any) for the Flink network? I will have close to a few million accounts. – Kranthi Jul 20 '21 at 14:40
  • You'll be serializing whatever gets broadcast, and sending a copy to every instance of your KeyedBroadcastProcessFunction. Normally that shouldn't be a big deal. (FYI, I've rewritten and expanded my answer.) – David Anderson Jul 20 '21 at 15:27
  • Thanks for expanding your answer; I have gone through it. I understand that there might be inconsistency problems during recovery if I update the keyed state inside `context.applyToKeyedState()`. But in my case, I only read the state to check whether the date matches, and there is no need to update the keyed state, so I hope I won't run into the problem you mentioned. Please confirm. Also, let me know if you foresee any other issues. – Kranthi Jul 21 '21 at 03:45
  • I don’t foresee any problems. – David Anderson Jul 21 '21 at 12:32