I am trying to understand the pros and cons of having a single data ingestion microservice versus a separate microservice for each source of data.

The context: there are multiple sources that I need to retrieve customer data from the first time a customer registers on my platform. However, each source (for example Strava, Garmin, and Endomondo, which are sources of fitness data) has a different method of pulling data, and some are more complex than others.

Pros for single data ingestion:

  • Fewer microservices would be present, so possibly fewer integration issues
  • Less time spent in development
  • Fewer teams need to be in charge, since there is only one service (if we follow the one-team-per-service rule)

Cons for single data ingestion:

  • Harder to pinpoint failure in a service
  • Availability for all the data sources could be compromised since there is technically a single point of failure
  • As more sources appear in the future, the codebase turns into a mini monolith

Current decision: Based on the pros and cons above, having an individual service for each source looks like the better option. If I were to go ahead, I was thinking of using the:

  1. API gateway pattern to encapsulate the individual microservices.
  2. Shared database pattern to store authentication tokens, for example
  3. Asynchronous messaging pattern to send the data retrieved from the sources to the final destination

I am looking forward to hearing whether I left out any pros or cons for either approach, or any counterpoints!

Mahir Hiro

1 Answer

First of all, in my opinion you chose the correct option. Its greatest advantage is that you are not coupling the different sources: a provider's API can change, a provider can disappear, or, as you said, more sources can appear, and the rest of your sources would not be affected at all. They would need no source code changes and no new releases. Using the asynchronous messaging pattern also means that your final destination is not coupled to your sources either.
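As a rough sketch of that decoupling, not a definitive design: every name here (`FitnessRecord`, `StravaConnector`, `GarminConnector`, `ingest`, `drain`) is hypothetical, the connectors return hard-coded data instead of calling any real provider API, and an in-process `queue.Queue` stands in for a real message broker.

```python
import queue
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class FitnessRecord:
    """Normalized record every connector publishes (hypothetical shape)."""
    source: str
    user_id: str
    distance_km: float


class Connector(Protocol):
    """What each per-source service has to provide."""
    def fetch(self, user_id: str) -> list[FitnessRecord]: ...


class StravaConnector:
    """Stand-in for the Strava service; no real API call is made."""
    def fetch(self, user_id: str) -> list[FitnessRecord]:
        return [FitnessRecord("strava", user_id, 5.0)]


class GarminConnector:
    """Stand-in for the Garmin service; no real API call is made."""
    def fetch(self, user_id: str) -> list[FitnessRecord]:
        return [FitnessRecord("garmin", user_id, 12.5)]


def ingest(connector: Connector, user_id: str, bus: queue.Queue) -> int:
    """Fetch from one source and publish to the bus; return the record count."""
    records = connector.fetch(user_id)
    for record in records:
        bus.put(record)
    return len(records)


def drain(bus: queue.Queue) -> list[FitnessRecord]:
    """Destination side: it only sees normalized records, never a connector."""
    received = []
    while True:
        try:
            received.append(bus.get_nowait())
        except queue.Empty:
            return received


bus = queue.Queue()
ingest(StravaConnector(), "user-1", bus)
ingest(GarminConnector(), "user-1", bus)
received = drain(bus)
```

Adding or removing a source only touches its own connector; `drain` and the record format stay the same.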

The only point I do not totally agree with is the shared database pattern. Is the token used to authenticate with each provider the same? In that case, it could be necessary. Either way, if the only data that needs to be persisted and shared between the services is the token, I would use a distributed cache like Redis, which is typically faster than a relational database for this kind of lookup.
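To illustrate the kind of interface I mean, here is a minimal in-memory stand-in, not a Redis client. The `TokenCache` class, its method names, and the `(provider, user_id)` key are my own assumptions for the example. Redis supports per-key expiry natively, so a real version would delegate `put` and `get` to it instead of using a dictionary.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """In-memory stand-in for a key-value cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._entries: dict[tuple[str, str], tuple[str, float]] = {}

    def put(self, provider: str, user_id: str, token: str, ttl_seconds: float) -> None:
        """Store a token that expires ttl_seconds from now."""
        expires_at = self._clock() + ttl_seconds
        self._entries[(provider, user_id)] = (token, expires_at)

    def get(self, provider: str, user_id: str) -> Optional[str]:
        """Return the token, or None if it is missing or has expired."""
        key = (provider, user_id)
        entry = self._entries.get(key)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return None
        return token


# Usage with a fake clock (a one-element list we can advance by hand).
now = [0.0]
cache = TokenCache(clock=lambda: now[0])
cache.put("strava", "user-1", "token-abc", ttl_seconds=3600)
fresh = cache.get("strava", "user-1")
now[0] = 3600.0
expired = cache.get("strava", "user-1")
```

Keying by provider keeps each provider's tokens separate without the services having to agree on a table schema.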

JArgente
  • No, they are not the same, but I can just use different tables in a single DB to deal with the different providers, right? – Mahir Hiro Jul 11 '21 at 14:01
  • Yes, you can, but sharing a database (schema) between microservices is not good practice, because doing so couples the microservices through the database. For example, if you change a table or a relation because one service needs it, you also have to modify and deploy the other microservices to adapt to the change. – JArgente Jul 11 '21 at 14:04
  • That makes sense. Is there any case where you would lean towards using a shared DB for microservices? – Mahir Hiro Jul 11 '21 at 14:08
  • No, I don't know of any general case, but perhaps it could make sense if, as I said in my answer, you have to share some information between the services, such as an authorization token. – JArgente Jul 11 '21 at 14:13