
I have a few services, like a Catalog Service, a Customer Service, a Recommendations Service, an Order Taking Service, and so on. Each service has its own keyspace in a Cassandra database.

I have two questions:

1 - For a change in a service: should I first publish the change event (or record) to Kafka and then consume it from that same service in order to update its database, or should I update the database first and then publish the record to Kafka?

2 - How do I choose which changes to publish to Kafka? Should I publish all updates to Kafka, even those of no interest to other services, like "attribute X updated to Y for product Z"?

  • Sounds like you might want to try out event storming (unrelated to the Kafka specifics). Think about the events that happen in the system and group them into logical pieces. Don't think in terms of services, but rather in terms of what sequence of actions happens in response to what. – OneCricketeer Mar 12 '20 at 14:21

1 Answer

1) I would suggest you always try to read your own writes. Which operation is more likely to succeed: a replicated ack from Kafka, or a durable Cassandra upsert? If you think Kafka is more durable, then you'd write there first, then use a tool like Kafka Connect to write it down to Cassandra (assuming you really need Cassandra over a GlobalKTable; that's up for debate).
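
For illustration, a minimal sketch of the "Kafka first" side of that trade-off, assuming a hypothetical `catalog-product-changes` topic keyed by product id (the topic name, key, and payload are made up); `acks=all` is what gives you the replicated ack, and a Kafka Connect Cassandra sink would then consume the topic and perform the upsert:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProductChangeProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // The "replicated ack": don't consider the write successful until all
        // in-sync replicas have the record.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key by product id so all changes to one product stay ordered
            // within a partition.
            ProducerRecord<String, String> record = new ProducerRecord<>(
                    "catalog-product-changes", "product-Z",
                    "{\"attribute\": \"X\", \"newValue\": \"Y\"}");
            // Block until the broker acks; only then is the change "committed".
            producer.send(record).get();
        } catch (Exception e) {
            // No ack means no durable write: surface the failure instead of
            // updating any local state.
            throw new RuntimeException("Change was not acknowledged by Kafka", e);
        }
    }
}
```

Only once `send(...).get()` returns has the cluster replicated the record; treating that as the commit point is what lets the service read its own writes back from the same topic.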

2) There's no straightforward answer. If you think the data might ever be consumed in a way that's relevant, then produce it. Think of it as an audit log of any and all events. If you want to build an idempotent system that always knows the latest state of any product and all the changes that happened, then you can either store the whole object each time as (id, product) pairs, where you holistically update the entire product, or you can store each delta of what changed and rebuild state from that (both shapes are sketched below).
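
As a sketch of the two event shapes (all type and field names here are hypothetical):

```java
// Option A: full-state ("holistic") event. The value is the whole product,
// keyed by productId, so the latest record per key IS the current state.
record ProductState(String productId, String name, String attributeX) {}

// Option B: delta event. The value names only what changed; current state
// must be rebuilt by folding all deltas for the key in offset order.
record ProductAttributeChanged(String productId, String attribute, String newValue) {}

class Rebuild {
    // Folding a delta into existing state. Kafka preserves order within a
    // partition, so applying deltas in consumption order is safe per key.
    static ProductState apply(ProductState current, ProductAttributeChanged delta) {
        return "X".equals(delta.attribute())
                ? new ProductState(current.productId(), current.name(), delta.newValue())
                : current;
    }
}
```

A design note: option A pairs naturally with a log-compacted topic, since Kafka then retains exactly the latest (id, product) pair per key; option B requires the full log to be retained so state can be rebuilt.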

  • OK, I got you on the first point. So for the second point, I understand that once Kafka is involved, I should try to avoid writing directly to the database, and prefer consuming everything from Kafka, even if the event is totally internal to the service, because that log might be useful in the future, for things like rebuilding a consumer, for example. Right? – acmoune Mar 12 '20 at 16:13
  • Seems you got it. Your service could even have producers and consumers on separate threads (preferably via the Kafka Streams API; see the first sketch after these comments). The only downside would be the network round trip compared to an in-memory call. – OneCricketeer Mar 12 '20 at 18:56
  • @cricket_007, I have another concern, related I think: is there a way to categorize topics in a Kafka cluster? For example, a way to say that those topics are about `Orders Tracking`? Or should I rely on naming conventions? I am still looking for it in the docs. – acmoune Mar 13 '20 at 13:41
  • Naming convention is the wild west, unfortunately. Topic names can be up to 249 characters, so prefixing in `kebab-case` is the most popular standard I've seen (see the second sketch below). If you mix underscores and periods, then metrics can be harder to gather. – OneCricketeer Mar 13 '20 at 14:34
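
Picking up the Kafka Streams suggestion from the comments, here is a minimal sketch of a service consuming everything back from Kafka and materializing the latest state per product, again assuming the hypothetical `catalog-product-changes` topic with string keys and values:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;

public class ProductStateApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "catalog-service");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Materialize the latest value per product id from the change topic:
        // the service "reads its own writes" back from Kafka instead of
        // writing its database directly.
        GlobalKTable<String, String> products = builder.globalTable(
                "catalog-product-changes",
                Consumed.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

A GlobalKTable like this is also what the answer alludes to as a possible alternative to Cassandra: the service can serve lookups from the materialized store rather than a separate database.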
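
And a small sketch of the naming-convention approach from the last comment, creating topics under a shared kebab-case prefix (topic names, partition counts, and replication factors here are arbitrary):

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateOrderTrackingTopics {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // The "category" lives in the name: everything about order
            // tracking shares the order-tracking- prefix, in kebab-case.
            admin.createTopics(List.of(
                    new NewTopic("order-tracking-status-changes", 3, (short) 1),
                    new NewTopic("order-tracking-shipments", 3, (short) 1)))
                 .all().get();
        }
    }
}
```

Categorization then becomes a prefix match: everything starting with `order-tracking-` belongs to order tracking, and ACLs or metrics dashboards can filter on the same prefix.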