I have a Spring Boot based Kafka Streams (KStreams) application that joins data across multiple topics. What are the best practices for handling a delay in one topic? I've read related questions such as "How to manage Kafka KStream to Kstream windowed join?" and others.
Here is my sample code (Spring Boot app) that produces mock data to two topics, Employee and Finance. The code for the employee topic is below:
private void sendEmpData() {
    // Sends a single mock employee record (the range yields only index 0).
    IntStream.range(0, 1).forEach(index -> {
        EmployeeKey key = new EmployeeKey();
        key.setEmployeeId(1);
        Employee employee = new Employee();
        employee.setDepartmentId(1000);
        employee.setEmployeeFirstName("John");
        employee.setEmployeeId(1);
        employee.setEmployeeLastName("Doe");
        kafkaTemplateForEmp.send(EMP_TOPIC, key, employee);
    });
}
Likewise for the finance topic:
private void sendFinanceData() {
    // Sends a single mock finance record; note the key also carries the department ID.
    IntStream.range(0, 1).forEach(index -> {
        FinanceKey key = new FinanceKey();
        key.setEmployeeId(1);
        key.setDepartmentId(1000);
        Finance finance = new Finance();
        finance.setDepartmentId(1000);
        finance.setEmployeeId(1);
        finance.setSalary(2000);
        kafkaTemplateForFinance.send(FINANCE_TOPIC, key, finance);
    });
}
The timestamp type associated with these records is TimestampType.CREATE_TIME, which I assume is the same as event time in Kafka Streams.
I have a simple KStreams app that rekeys the finance topic so that the finance stream key matches the employee stream key, and then performs the join as below:
employeeKStream.join(reKeyedStream,
        (employee, finance) -> new EmployeeFinance(employee.getEmployeeId(),
                employee.getEmployeeFirstName(),
                employee.getEmployeeLastName(),
                employee.getDepartmentId(),
                finance.getSalary(),
                finance.getSalaryGrade()),
        JoinWindows.of(windowRetentionTimeMs), // 30 seconds
        Joined.with(
                employeeKeySerde,
                employeeSerde,
                financeSerde))
    .to(outputTopic, Produced.with(employeeKeySerde, employeeFinanceSerde));
If a record with a matching key arrives in the finance topic more than 30 seconds later, the join doesn't happen. Any insights on how to address this would be helpful. Thank you in advance.
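To make the behaviour I'm describing concrete, here is a minimal plain-Java sketch of how I understand the window check. It is a simplified model, not Kafka Streams code: the class JoinWindowSketch, the method withinWindow and the timestamps are all hypothetical names and values I made up for illustration. The assumption is that two records with the same key join only when their event timestamps differ by at most the window size.

```java
import java.time.Duration;

// Simplified model of a symmetric stream-stream join window.
// This is NOT the Kafka Streams API, only an illustration of the timestamp check.
class JoinWindowSketch {

    // Records with the same key are join candidates only if their
    // timestamps differ by no more than the window size (boundary inclusive).
    static boolean withinWindow(long employeeTsMs, long financeTsMs, Duration window) {
        return Math.abs(employeeTsMs - financeTsMs) <= window.toMillis();
    }

    public static void main(String[] args) {
        Duration window = Duration.ofSeconds(30);      // same size as in my join
        long employeeTs = 1_000_000L;                  // hypothetical CREATE_TIME of the employee record
        long onTimeFinanceTs = employeeTs + 10_000L;   // finance record stamped 10 s later
        long lateFinanceTs = employeeTs + 45_000L;     // finance record stamped 45 s later

        System.out.println(withinWindow(employeeTs, onTimeFinanceTs, window)); // true
        System.out.println(withinWindow(employeeTs, lateFinanceTs, window));   // false
    }
}
```

In this model the second finance record is the case I'm asking about: its timestamp falls outside the 30-second window, so no joined record is produced.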
P.S.: This data is a work of fiction. If it matches your department ID or salary, it's merely coincidental. :)