I was watching one of Robin Moffatt's videos (https://rmoff.net/2020/06/17/loading-csv-data-into-kafka/) and believe Apache Kafka might help me automate a workflow I have.
I have a requirement where I need to ingest a CSV from a customer, send a subset of the original information to two vendors in various formats (text or CSV), receive data back from those vendors, and then merge all of the data.
I'm somewhat of a newbie to Kafka, but I was thinking I'd have a process as follows:
Ingest the data from the customer into Kafka and save it to either a SQL Server or Postgres database. I will then publish two "we have data" streams. Each stream would essentially carry a single message that represents the batch we received from the customer. These streams (topics) will be consumed by KafkaJS consumers. Using information in the message, each consumer will select data out of the database based on the output required for that vendor.
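For illustration, here's roughly what I had in mind for one of those vendor consumers with KafkaJS (the broker address, topic name, and the `fetchVendorSubset` / `writeVendorFile` helpers are placeholders I made up, not anything we have yet):

```typescript
import { Kafka } from "kafkajs";

const kafka = new Kafka({
  clientId: "vendor-a-exporter",   // placeholder client id
  brokers: ["localhost:9092"],     // placeholder broker
});

const consumer = kafka.consumer({ groupId: "vendor-a-exporter" });

// Placeholder: query SQL Server/Postgres for the subset vendor A needs.
async function fetchVendorSubset(batchId: string): Promise<unknown[]> {
  return []; // e.g. SELECT ... FROM customer_rows WHERE batch_id = $1
}

// Placeholder: write the rows out in the text/CSV format vendor A expects.
async function writeVendorFile(rows: unknown[]): Promise<void> {}

async function run() {
  await consumer.connect();
  await consumer.subscribe({ topic: "batch-ready-vendor-a", fromBeginning: false });

  await consumer.run({
    eachMessage: async ({ message }) => {
      // The single "we have data" message carries the id of the batch we stored in the DB.
      const { batchId } = JSON.parse(message.value!.toString());
      const rows = await fetchVendorSubset(batchId);
      await writeVendorFile(rows);
    },
  });
}

run().catch(console.error);
```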
At this point in the process we are expecting two responses. As each response comes in (via SFTP), we will ingest the response file (JSON or CSV) into the database just as we did with the original customer information. Once we have received all of the data, we will publish another message, which will be consumed by the consumer that merges everything together.
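And on the response side, something like this sketch is what I was picturing for publishing the "everything is in" message once both vendor files have been ingested (again, the topic name and the `allResponsesReceived` helper are just placeholders):

```typescript
import { Kafka } from "kafkajs";

const kafka = new Kafka({
  clientId: "vendor-response-ingester",  // placeholder client id
  brokers: ["localhost:9092"],           // placeholder broker
});

const producer = kafka.producer();

// Placeholder: check the DB to see whether both vendor responses for this batch are stored.
async function allResponsesReceived(batchId: string): Promise<boolean> {
  // e.g. SELECT COUNT(DISTINCT vendor_id) FROM vendor_responses WHERE batch_id = $1
  return false;
}

// Called after each SFTP response file has been parsed and written to the database.
async function onVendorResponseIngested(batchId: string): Promise<void> {
  if (await allResponsesReceived(batchId)) {
    await producer.connect();
    await producer.send({
      topic: "batch-complete",           // consumed by the merge consumer
      messages: [{ key: batchId, value: JSON.stringify({ batchId }) }],
    });
    await producer.disconnect();
  }
}
```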
Do any of the Kafka ninjas like Robin have any suggestions? Much appreciated.
GD