1

We are decomposing a monolithic legacy system into microservices. As we do so, we can't completely remove reliance on some of the data in the legacy system that's required for each microservice.

Should each microservice integrate directly with the legacy database to read and write data under its domain or should we have a single integration service responsible for fetching and writing data to the legacy system?

DaveO
    Does this answer your question? [What is an Anti-Corruption layer, and how is it used?](https://softwareengineering.stackexchange.com/questions/184464/what-is-an-anti-corruption-layer-and-how-is-it-used) – Dan Wilson Jun 24 '20 at 07:14
  • @DanWilson Thanks. I'm assuming I'll need some abstraction layer to avoid introducing legacy concepts into the new domain model, but I'm primarily interested in whether this should be spread across microservices or consolidated into one. I'd prefer each microservice to manage this integration itself, but I'm also wary this might introduce inefficiencies or constraints in some way I'm not aware of. – DaveO Jun 24 '20 at 08:45
  • 3
    I'd say the answer is *it depends*. Throw it all on a whiteboard and see what makes sense for your situation. One-size-fits-all likely does not apply here. – Dan Wilson Jun 24 '20 at 19:12

3 Answers

1

(This is not meant to be a comprehensive "cookbook" or "list" answer of all possible design approaches. Simply an answer with one possible design approach.)

Frequently, a legacy system's database contains and maintains facts about a number of different concerns. They may or may not be closely related, but in any event you can separate these concerns, and when you do you may also find that the way you interact with each of them is different.

With respect to "the way you interact ... is different", consider that even for the same set of information you are sometimes doing a transactional update on a single item, sometimes running a specific query that returns 1 to 100 items without updating any, and sometimes generating a report that returns 1,000,000 items without updating any. Also, sometimes you interact directly with only one of these concerns, and sometimes with two (or more) at the same time. (See the list below.)

In that case you can set up a new microservice as the owner of each concern. Each one talks to the legacy database in whatever way is appropriate, but exposes an API that is expressed directly in the domain language of its concern.
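To make the shape of such a concern-owning service concrete, here is a minimal TypeScript sketch. Everything in it is hypothetical: the room-availability concern, the legacy table and column names, and the use of node-postgres to reach the legacy database. The point is only that the public API speaks the domain language while the legacy schema stays behind the boundary, and that transactional, query-sized, and report-sized interactions get different shapes:

```typescript
import { Pool } from "pg"; // assuming the legacy database is reachable as SQL via node-postgres

// Domain-language view of one nightly availability fact.
export interface NightlyAvailability {
  hotelId: string;
  roomType: string;
  night: Date;
  roomsFree: number;
}

// The public API speaks the domain language of the concern, not the
// language of the legacy schema. Note the three interaction shapes:
// transactional single-item update, small query, bulk streamed report.
export interface RoomAvailabilityService {
  holdRoom(hotelId: string, roomType: string, night: Date): Promise<boolean>;
  availability(hotelId: string, from: Date, to: Date): Promise<NightlyAvailability[]>;
  occupancyReport(chainId: string, year: number): AsyncIterable<NightlyAvailability>;
}

export class LegacyBackedAvailability implements RoomAvailabilityService {
  constructor(private readonly legacyDb: Pool) {}

  async holdRoom(hotelId: string, roomType: string, night: Date): Promise<boolean> {
    // Single-row transactional update; the table and columns are invented.
    const result = await this.legacyDb.query(
      `UPDATE legacy_room_inventory
          SET rooms_free = rooms_free - 1
        WHERE hotel_id = $1 AND room_type = $2 AND night = $3 AND rooms_free > 0`,
      [hotelId, roomType, night.toISOString()]
    );
    return (result.rowCount ?? 0) === 1;
  }

  async availability(hotelId: string, from: Date, to: Date): Promise<NightlyAvailability[]> {
    const { rows } = await this.legacyDb.query(
      `SELECT hotel_id, room_type, night, rooms_free
         FROM legacy_room_inventory
        WHERE hotel_id = $1 AND night BETWEEN $2 AND $3`,
      [hotelId, from.toISOString(), to.toISOString()]
    );
    // Translate legacy column names into domain terms at the boundary.
    return rows.map((r) => ({
      hotelId: r.hotel_id,
      roomType: r.room_type,
      night: new Date(r.night),
      roomsFree: r.rooms_free,
    }));
  }

  async *occupancyReport(chainId: string, year: number): AsyncIterable<NightlyAvailability> {
    // A report-sized read would stream rows with a cursor instead of
    // materializing a million of them at once; elided in this sketch.
  }
}
```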

By providing these "intermediate" microservices, your system decomposes more readily into services handling particular needs, and you gain not just insulation from the legacy implementation but also modularity (with everything that gives you) in your evolving system. You might also find that some constraints are loosened, leading to a better operational environment overall; for example, you may discover a way to handle transactions quickly without interference from your reporting needs.

This is all very abstract, so here are some concrete examples:

  • Legacy database contains information about hotels. Room availability today, room availability in the future, historical room availability, and then maybe also room rates, upcoming promotions, financial information like taxes of various kinds that are paid for each room rate, geographical location, which chain the hotel is part of, etc. etc.

  • Legacy database contains information about inventory. What's currently in inventory, what's expected and when, what's currently in flight, damaged inventory, lost inventory items, inventory that needs manual handling, inventory age and when the inventory ages out and needs to be discarded, who owns the inventory, price of each item in the inventory (including, or not, historical prices), etc. etc.

    • Example of dealing with multiple facets: report on SKUs that are problematic, i.e. frequently broken or lost in the warehouse, not supplied in a timely fashion, or causing other problems.
  • The legacy database contains information about customers. Contact information, what's currently in his cart, last time seen, order history, customer service history, financial information, etc. etc. etc.

    • Example of an action dealing with multiple facets here: should the customer be charged for his return of this item from this order, taking into account how often he orders from us and how much, and also whether he seems to be "abusing" this privilege for a certain class of items? (A sketch of composing such a decision follows this list.)
  • And so on.
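To illustrate the multi-facet cases above, here is a hedged sketch of composing such a decision from two concern-owning services instead of joining legacy tables directly. Both service interfaces and the thresholds are invented for the example:

```typescript
// Hypothetical facades over two concern-owning microservices.
interface OrderHistoryService {
  orderStats(customerId: string): Promise<{ ordersLastYear: number; totalSpend: number }>;
}

interface ReturnsService {
  recentReturns(customerId: string, skuClass: string): Promise<number>;
}

// The decision logic speaks only the domain language of the two facades;
// it never touches the legacy tables directly.
export async function shouldChargeRestockingFee(
  customerId: string,
  skuClass: string,
  orders: OrderHistoryService,
  returns: ReturnsService
): Promise<boolean> {
  const [stats, returnCount] = await Promise.all([
    orders.orderStats(customerId),
    returns.recentReturns(customerId, skuClass),
  ]);
  const isFrequentBuyer = stats.ordersLastYear >= 12 || stats.totalSpend >= 5000;
  const looksAbusive = returnCount >= 5; // arbitrary threshold for the sketch
  return looksAbusive || !isFrequentBuyer;
}
```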

davidbak
0

Each microservice should integrate directly with the legacy database:
Because you can! (as implied by your question).
Reasons:

  1. Adding an integration service is an additional moving part, and it does not appear to be required by your solution.
  2. Direct integration results in a simpler system overall. (This ignores the potential benefit of abstraction in cases where each microservice's code for the legacy database interface would be complex.)
  3. If and when the legacy data ages out or is replaced, each microservice can be updated to stop using it. An integration service should then become redundant, but would you know for sure? It cannot become a liability if it is not there!
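If you go this route, a useful discipline is to confine each microservice's legacy dependency to one small private adapter, which keeps reason 3 cheap. A hedged TypeScript sketch, with an invented `legacy_customers` table, node-postgres access, and toy business logic:

```typescript
import { Pool } from "pg"; // assuming the legacy database speaks SQL

// Hypothetical shipping microservice; legacy_customers and its columns are invented.
export class ShippingService {
  constructor(private readonly legacyDb: Pool) {}

  // The legacy dependency is confined to this one private method, so when
  // the legacy data ages out, only this method needs to change.
  private async legacyShippingAddress(customerId: string): Promise<string | undefined> {
    const { rows } = await this.legacyDb.query(
      `SELECT ship_addr FROM legacy_customers WHERE cust_id = $1`,
      [customerId]
    );
    return rows[0]?.ship_addr;
  }

  async quoteShipping(customerId: string): Promise<number> {
    const address = await this.legacyShippingAddress(customerId);
    if (!address) throw new Error(`no address on file for customer ${customerId}`);
    // Toy pricing rule, purely illustrative.
    return /\b(AK|HI)\b/.test(address) ? 25 : 10;
  }
}
```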
-1

I would say that, as a rule of thumb, it is much better to develop your new services with their own persistence layer, as per microservice development best practices. Data isolation is a primary tenet of microservice development.

Now, that being said, you will not always be lucky enough to have a scenario which allows this, as in your particular case. In a situation like yours, I would still recommend that, as you move forward with the process of breaking up your monolithic application, you build in dedicated persistence for each service and, as you mention above, use an integration service responsible for ensuring eventual consistency with the legacy persistence layer.

If you are developing within a cloud-native environment, you can use a service such as Azure Cosmos DB, which offers a feature called the change feed. The change feed records inserts and updates to a container and fires off events which you can then listen for using a dedicated integration service. The integration service could even be implemented using a serverless technology; on the Azure platform this would be an Azure Function App, which comes out of the box with a trigger for the Cosmos DB change feed. This serverless strategy also keeps your costs low, since the consumption plan includes a monthly grant of one million free executions.
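As a sketch of what that integration function might look like, assuming the v4 Node.js programming model of `@azure/functions`: the database, container, and connection-setting names, and the `writeToLegacyDb` helper, are placeholders, not part of any real project:

```typescript
import { app, InvocationContext } from "@azure/functions";

app.cosmosDB("syncToLegacy", {
  connection: "CosmosDbConnection",      // app setting holding the Cosmos DB connection string
  databaseName: "orders",                // placeholder database name
  containerName: "orders",               // placeholder container name
  createLeaseContainerIfNotExists: true, // the change feed checkpoints its position in a lease container
  handler: async (documents: unknown[], context: InvocationContext): Promise<void> => {
    // Each invocation receives a batch of changed documents from the change feed.
    for (const doc of documents) {
      context.log(`Propagating change to legacy store: ${JSON.stringify(doc)}`);
      // Map the document back into the legacy schema and write it there.
      // writeToLegacyDb is a hypothetical helper, not shown here:
      // await writeToLegacyDb(doc);
    }
  },
});
```

The lease container is how the change feed processor tracks its position, so redeploying the function does not replay the whole history.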

The primary downside of using an integration service in this manner is that you would be required to architect the software with eventual consistency in mind. That is, the portions of your application which still use the legacy data tier will be slightly out of sync with your newer services that follow proper service-based design principles. For most use cases this should not be much of a problem, as the databases should converge within a few seconds if you go the Azure Functions route described above.


Additional Remarks... (updated)

I am updating this answer with the following link to an MSDN blog post that I happened to come across several hours after writing this answer; it might give you some deeper insight into the whole issue of data synchronization:

Forecast: Cloudy - Branch-Node Synchronization with SQL Azure, Part 2: Service-Based Sync

  • Sorry, this doesn't answer the question. My question is whether we should have each service integrate with the legacy layer independently, or use a "legacy integration" microservice dedicated to accessing legacy data. – DaveO Jun 28 '20 at 08:34
  • Maybe my answer was not clear enough, but what I was trying to say is that each service SHOULD NOT integrate directly with the legacy layer, as this is a major violation of microservice design principles. Each microservice should have its own dedicated persistence layer, which can only be accessed by an external service via APIs that the owning microservice exposes. This would then lead you to expose APIs for accessing the legacy data layer, which is why I was suggesting a serverless approach such as Azure Functions for providing said legacy-access API. – Chaplin Marchais Jul 04 '20 at 07:59