3

The company I work for is investigating moving from our current monolithic API to microservices. Our current API is heavily dependent on spring and we use SQL server for most persistence. Our microservice investigation is leaning toward spring-cloud, spring-cloud-stream, kafka, and polyglot persistence (isolated database per microservice).

I have a question about how messaging via kafka is typically done in a microservice architecture. We're planning to have a coordination layer between the set of microservices and our client applications, which will coordinate activities across different microservices and isolate clients from changes to microservice APIs. Most of the stuff we've read about using spring-cloud-stream and kafka indicate that we should use streams at the coordination layer (source) for resource change operations (inserts, updates, deletes), with the microservice being one consumer of the messages.

Where I've been having trouble with this is inserts. We make heavy use of database-assigned identifiers (identity columns/auto-increment columns/sequences/surrogate keys), and they're usually assigned as part of a post request and returned to the caller. The coordination layer may be saving multiple things using different microservices and often needs the assigned identifier from one insert before it can move on to the next operation. Using messaging between the coordination layer and microservices for inserts makes it so the coordination layer can't get a response from the insert operation, so it can't get the assigned identifier that it needs. Additionally, other consumers on the stream (i.e. consumers that publish the data to a data warehouse) really need the message to contain the assigned identifier.

How are people dealing with this problem? Are database-assigned identifiers an anti-pattern in microservices? Should we expose separate microservice endpoints that return database-assigned identifiers so that the coordination layer can make a synchronous call to get an identifier before calling the asynchronous insert? We could use UUIDs but our DBAs hate those as primary keys, and they couldn't be used as an order number or other user-facing generated ids.

gadams00
  • 771
  • 1
  • 10
  • 24

2 Answers2

1

If you can programmatically create the identifier earlier while receiving from the message source, you can embed the identifier as part of the message header and subsequently use the message header information during database inserts and in any other consumers.

But this approach requires a separate verification by the other consumers against the database to process only the committed transactions (if you are concerned about processing only the inserts).

Ilayaperumal Gopinathan
  • 4,099
  • 1
  • 13
  • 12
-1

At our company, we built a dedicated service responsible for unique ids generation. And every other services grap the ids they need from there.

These generated ids couldn't be used as an order number but I think it's shouldn't be used for this job anyway. If you need to sort by created date, it's better to have a created_date field.

One more thing that is used to bug my mind with this approach is that the primary resource might be persisted after the other resource that rerefence it by the id. For example, a insert user, and insert user address request payload are sent asynchronously. The insert user payload contains a generated unique id, and user address payload contains that id as foreign reference back to user. The insert user address might be proccessed before the insert user request, but it's totally fine. I think it's called eventual consistency.

tiengtinh
  • 121
  • 7