
For example,

  • You have an IT estate with a mix of batch and real-time data sources from multiple systems, e.g. ERP, project management, asset management, website, monitoring, etc.
  • The aim is to integrate the data sources into a (vendor-agnostic) cloud environment.
  • There is a need for reporting and analytics on combinations of all data sources.
  • Inevitably, some source systems are not capable of streaming, hence batch loading is required.
  • There are potential use cases for triggering functionality, changes, or updates based on the ingested data.

Given a steer to create a future-proofed platform, how would you approach the architecture?

2 Answers


It's a very open-ended question, but there are some good principles you can adopt to help point you in the right direction:

Avoid point-to-point integration, and get everything going through a few common points - ideally one. Using an API Gateway can be a good place to start; the big players (Azure, AWS, GCP) all have their own options, plus there are plenty of decent independent ones like Tyk or Kong.

Batches and event streams are totally different, but even then you can still potentially route them all through the gateway so that you get centralised observability (reporting, analytics, alerting, etc.).
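To make that concrete, here is a minimal sketch in plain Python (not any real gateway product; the routes and handlers are invented for illustration) of the single-entry-point idea: every call, whether a batch drop or an event, passes through one place where logging and metrics can be applied before being routed on.

    import logging
    import time
    from dataclasses import dataclass
    from typing import Any, Callable, Dict

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("gateway")


    @dataclass
    class Request:
        route: str    # e.g. "erp/orders" or "assets/telemetry" (made-up routes)
        kind: str     # "batch" or "event"
        payload: Any


    class Gateway:
        """Single choke point: every integration registers a handler here."""

        def __init__(self) -> None:
            self._handlers: Dict[str, Callable[[Request], Any]] = {}

        def register(self, route: str, handler: Callable[[Request], Any]) -> None:
            self._handlers[route] = handler

        def handle(self, req: Request) -> Any:
            start = time.perf_counter()
            handler = self._handlers.get(req.route)
            if handler is None:
                log.warning("no route for %s", req.route)
                raise KeyError(req.route)
            try:
                return handler(req)
            finally:
                # Centralised observability: one place to emit metrics for
                # every integration, batch or streaming alike.
                log.info("%s %s took %.3fs", req.kind, req.route,
                         time.perf_counter() - start)


    if __name__ == "__main__":
        gw = Gateway()
        gw.register("erp/orders", lambda r: f"loaded {len(r.payload)} order rows")
        gw.register("assets/telemetry", lambda r: "event queued")

        print(gw.handle(Request("erp/orders", "batch", [{"id": 1}, {"id": 2}])))
        print(gw.handle(Request("assets/telemetry", "event", {"asset": "pump-7"})))

In a real estate the Gateway class would be the managed gateway product itself; the point is only that routing and observability live in one place rather than in every point-to-point link.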

Use standards-based API specifications where possible. A good REST-based API built on a proper resource model is a non-trivial undertaking, and it may not fit if you are dealing with lots of disparate legacy integration. If you do adopt REST, use OpenAPI to specify the APIs. Using this standard not only makes life easier for consumers, it also gives you better tooling, as many design, build and test tools support OpenAPI. There's also AsyncAPI for event/async APIs.
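For a feel of what that looks like, here is a tiny, illustrative OpenAPI 3.0 document. It's built as a Python dict and dumped to JSON purely to stay in one language with the other sketches (in practice you'd keep the spec as a YAML or JSON file under version control), and the /assets resource and its fields are made up for the example.

    import json

    spec = {
        "openapi": "3.0.3",
        "info": {"title": "Asset API", "version": "1.0.0"},
        "paths": {
            "/assets": {
                "get": {
                    "summary": "List assets",
                    "responses": {
                        "200": {
                            "description": "A page of assets",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "array",
                                        "items": {"$ref": "#/components/schemas/Asset"},
                                    }
                                }
                            },
                        }
                    },
                }
            }
        },
        "components": {
            "schemas": {
                "Asset": {
                    "type": "object",
                    "properties": {
                        "id": {"type": "string"},
                        "name": {"type": "string"},
                        "site": {"type": "string"},
                    },
                    "required": ["id", "name"],
                }
            }
        },
    }

    # Emit the contract; design, mock and test tooling can all work from it.
    print(json.dumps(spec, indent=2))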

Do some architecture. Moving sh*t to cloud doesn't remove the sh*t - it just moves it to the cloud. Don't recreate old problems in a new place.

  • Work out the logical components in your new solution: what does each of them do (what's its reason to exist)? Don't forget ancillary components like API catalogues, etc.
  • Think about layering the integration (usually depending on how each layer will be consumed and what role it needs to play, e.g. system interface, orchestration, experience APIs, etc.).
  • Want to handle data in a consistent way regardless of source (your 'agnostic' comment)? You'll need to think through how data is ingested and processed. This might lead you into more data / ETL centric considerations rather than integration ones (a rough sketch of the idea follows below this list).
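Here is the rough sketch promised in the last bullet: wrap every record, whether it arrived as a batch extract or a streamed event, in the same envelope before it lands, so downstream reporting and analytics only ever see one shape. The source names and fields are invented, and this is only the shape of the idea, not a real pipeline.

    import json
    import uuid
    from datetime import datetime, timezone
    from typing import Any, Dict, Iterable, List


    def envelope(source: str, kind: str, record: Dict[str, Any]) -> Dict[str, Any]:
        """Wrap any raw record in a common envelope (lineage + timestamps)."""
        return {
            "ingest_id": str(uuid.uuid4()),
            "source": source,          # e.g. "erp", "website", "monitoring"
            "kind": kind,              # "batch" or "stream"
            "ingested_at": datetime.now(timezone.utc).isoformat(),
            "payload": record,
        }


    def ingest_batch(source: str, rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return [envelope(source, "batch", row) for row in rows]


    def ingest_event(source: str, event: Dict[str, Any]) -> Dict[str, Any]:
        return envelope(source, "stream", event)


    if __name__ == "__main__":
        landed = ingest_batch("erp", [{"order_id": 42, "value": 99.5}])
        landed.append(ingest_event("monitoring", {"asset": "pump-7", "status": "warn"}))
        print(json.dumps(landed, indent=2))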

Co-design. Is the integration mainly data coming in or going out? Is the integration with 3rd parties or strictly internal?

If you are designing for external / 3rd party consumers then a co-design process is advised, since you're essentially designing the API for them.

If the APIs are for internal use, consider designing them as if they were for external use, so that when/if you decide to open them up later it's not so hard.

Take a step back:

  • Continually ask yourselves "what problem are we trying to solve?". Usually, a technology initiative is successful if there's a well-understood reason for doing it, which has solid buy-in from the business (non-IT).
  • Who wants the reporting, and why - what problem are they trying to solve?
Adrian K
  • This is really useful advice. Any comments on building with the future in mind, e.g. is the end goal for all architectures event-driven, or will there always be a need for data-driven? – Interested Developer Jul 23 '21 at 08:14
  • I'm not sure regarding event vs data driven specifically, BUT I do know that you should always understand the problem first, then match the correct solution to it. You may have heard the phrase "to a hammer, everything looks like a nail". In terms of the future - the more tools you have the better. Being dogmatically wedded to a single idea usually ends badly. – Adrian K Jul 24 '21 at 06:01

As you mentioned, it's an IT estate (an enterprise-level solution) with a mix of batch and real-time sources, so first you have to identify the end goal of this migration. You can think about refactoring applications; if you are trying to make the estate event-driven, then assess the refactoring effort and cost. Separation of responsibility is the key factor for refactoring and migration.

If you are thinking about future-proofing your solution, then consider the cloud for storing and processing your data. It won't necessarily be cheap, but a mix of cloud and on-prem could be a way forward. Cloud providers offer services to move your data at minimal cost, and cloud-native solutions exist for performing analysis on it. A database migration service in AWS or Azure can move the data and then capture ongoing changes, so you can keep using your on-prem DB and apps while performing analysis for reporting in the cloud. That also eases the load on your transactional DB. Most data sync from on-prem to cloud is near real time.
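As a toy illustration of that "move the data, then capture ongoing changes" pattern: a managed service such as AWS DMS or Azure Database Migration Service does this properly with log-based CDC, but the shape of the idea can be approximated with a simple updated_at high-watermark. Everything below (the table, columns and timestamps, and the in-memory SQLite database) is invented for the sketch.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, value REAL, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10.0, "2021-07-01T10:00:00"), (2, 25.5, "2021-07-02T09:30:00")],
    )
    conn.commit()


    def extract_since(watermark: str):
        """Pull only rows changed since the last successful extract."""
        cur = conn.execute(
            "SELECT id, value, updated_at FROM orders WHERE updated_at > ? "
            "ORDER BY updated_at",
            (watermark,),
        )
        return cur.fetchall()


    # Initial (full) load, then an incremental pass after new activity.
    watermark = ""                          # empty watermark => full load
    rows = extract_since(watermark)
    print("initial load:", rows)
    watermark = rows[-1][2]                 # advance to the latest change seen

    conn.execute("INSERT INTO orders VALUES (3, 7.25, '2021-07-03T08:15:00')")
    conn.commit()
    print("incremental:", extract_since(watermark))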

TusharK
  • So agree on the cloud comment, I think the intention would be to integrate all data into a cloud platform, migrate applications to the cloud where possible/feasible, and then ingest external app data. What are your thoughts on how event driven architecture will evolve over time, e.g. will it be the best-practice method for ALL architectures, regardless of data frequency and latency? – Interested Developer Jul 26 '21 at 13:25
  • @InterestedDeveloper Event-driven architecture is not a new concept; earlier (and even now) we used ESBs, MQs, listeners, etc. to perform async operations using all on-prem resources. Now it has just moved to cloud/containers and got a fancy name. It is going to stay around for a long time. Based on client need we can decide whether event-driven is required. Yes, latency, data frequency and many other factors like OLTP, OLAP, DWH, etc. should be accounted for when designing the solution. It will always be a mix and match, and if NOT implemented properly there will be a lot of technical debt. – TusharK Jul 27 '21 at 13:06