
I'm developing a software system which provides an HTTP REST API, and I want to achieve a very modular and flexible design. There is core functionality, and there are feature modules, each handling only the logic related to one specific feature. The goal is to be able to add or remove modules flexibly, without having to manually connect each module to the core and to the other modules. Therefore modules should not have tight dependencies on each other, and the core should be very small and should not depend on any feature.

I've found that event-based architecture is a good candidate for my goals.

However, I have some difficulties in implementing it in a robust manner.

1. How to approach information sharing?

Initially, when some logic is executed inside the application (e.g. during user request processing), there is a context around it. For example, some entities were loaded from the database (I'm using an ORM library for this). And then I want to fire an event.

What data should I pass with the event? I can see two approaches:

  • Pass related entities with the event
  • Pass only entity IDs (i.e. primary key, which could be used to fetch the entity from the database)

With the first approach, the data can be used from the event right away, which is convenient. However, what if some previous event handler already modified the data? We can't know for sure, because event handlers are decoupled. So the passed data could already be outdated.

With the second approach we force fresh data to be loaded from the database, because we only send IDs and not the actual data. So each event handler has to fetch the entities manually. This solves the data-freshness problem, but it creates another one in terms of performance: there could be dozens of event handlers reacting to a single user request, and each one will issue its own queries to the database. This looks like a very inefficient approach that will put stress on the database.
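
To make the trade-off concrete, here is a minimal sketch of the two event shapes (TypeScript used purely for illustration; the event and field names are hypothetical):

```typescript
// Approach 1: the event carries the loaded entity itself.
interface UserRegisteredWithEntity {
  type: "user.registered";
  // Convenient, but may already be stale by the time a handler runs.
  user: { id: string; email: string; name: string };
}

// Approach 2: the event carries only the identifier.
interface UserRegisteredWithId {
  type: "user.registered";
  // Always fresh, but every handler must re-fetch from the database.
  userId: string;
}
```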

2. How to handle concurrency (transaction isolation)?

Another problem is that we need to ensure our code is safe from a concurrency standpoint. We don't want parallel requests, or even different parts of the same process, to modify the same data. To achieve this, we need to use database transactions and various locking features.

And again, we could start a transaction and pass it along with the event, so the event handler would be able to use the same isolation layer. Or each event handler could have its own transaction. But this again leads to the question of database performance.

I was thinking about creating a new transaction for each user request and passing it to every method that requires database access (directly or via events). That way each request, with all possible event handlers, would be executed in a single transaction. Is this a good approach?
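
A minimal sketch of that idea, assuming a hypothetical `db.transaction` helper and event bus (nothing here is a specific library's API):

```typescript
// Hypothetical transaction handle and event types.
interface Tx {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

interface AppEvent {
  type: string;
  payload: unknown;
}

type Handler = (event: AppEvent, tx: Tx) => Promise<void>;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  on(type: string, handler: Handler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  // Handlers run sequentially inside the caller's transaction.
  async emit(event: AppEvent, tx: Tx): Promise<void> {
    for (const handler of this.handlers.get(event.type) ?? []) {
      await handler(event, tx);
    }
  }
}

interface Db {
  transaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T>;
}

// One transaction per request: the controller and every handler it
// triggers share the same isolation layer and commit together.
async function handleRequest(db: Db, bus: EventBus): Promise<void> {
  await db.transaction(async (tx) => {
    // ... core request logic that loads/modifies entities via tx ...
    await bus.emit({ type: "order.placed", payload: { orderId: 42 } }, tx);
  });
}
```

One consequence of this scheme worth noting: a failure in any single handler rolls back the whole request, which may or may not be the behavior you want.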

Generally, this architecture raises many questions, so I would be very grateful for any good books or articles on the subject. Or maybe there is a better approach to achieve my goals of flexibility and modularity?

  • What do you understand by modules? – Vadim Samokhin Dec 16 '17 at 18:42
  • @Zapadlo a module is a directory with code related to some specific and isolated feature of the application. A module can interface with the core's API and react to some events, effectively enhancing/altering the application's behavior. A module can provide its own entities, request controllers, and CLI commands, among other things. – Slava Fomin II Dec 16 '17 at 18:45
  • Both of your concerns may actually become a big issue for one primary reason: modules are not truly separated; they need each other's data and ask each other for confirmation to do their own work. Minimize that and you've got a well-decoupled system. – S.D. Dec 18 '17 at 07:44

2 Answers


What data should I pass with the event?

Definitely only the ID. An event is a notification that something happened; it is not intended to be a data container. This approach results in loose coupling, so when you find out that your database can't cope with the load, it won't take much effort to move that functionality onto a separate machine with its own database.

But I find your remark about data reuse a bit suspicious. Most of the time, an event separates quite different activities. It indicates that something happened from the business perspective, which implies that some coherent (sub-)feature or business-process step completed, and another step follows it. More generally, this is the concept of a saga. It implies that each step is carried out in isolation from any other, probably with each one residing on its own machine with its own database.

Another valid case for an event is the DDD concept of a domain event. It usually implies that one use case is split into sub-steps, and each sub-step is implemented in its own application service, though every step is part of a single database transaction. Here is an example of that technique.
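
As a rough illustration of that shape (hypothetical names, not the linked example; the `Tx` handle stands for whatever transaction abstraction your ORM provides):

```typescript
// Sketch: one use case split into sub-steps that all share a
// single database transaction.
interface Tx { /* transaction handle provided by your ORM */ }

interface DomainEvent { type: string; orderId: number }

type Dispatch = (event: DomainEvent, tx: Tx) => Promise<void>;

class OrderService {
  constructor(private dispatch: Dispatch) {}

  async placeOrder(tx: Tx, orderId: number): Promise<void> {
    // ... persist the order via tx (sub-step 1) ...
    // Sub-step 2 (e.g. reserving stock) lives in another application
    // service; its handler runs synchronously inside the same
    // transaction, so both steps commit or roll back together.
    await this.dispatch({ type: "order.placed", orderId }, tx);
  }
}
```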

How to handle concurrency (transaction isolation)?

I guess there are two major points worth mentioning. The first one is the concept of eventual consistency. Basically it's about what I've already covered: splitting your use case into sub-steps and treating each one ACID-ly. That way you won't have huge transactions, which are a road to tight coupling and deadlocks.
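
To illustrate (hypothetical names, not a concrete framework):

```typescript
// Eventual consistency: each step runs its own short ACID
// transaction and then hands off to the next step asynchronously.
interface Db { transaction<T>(fn: (tx: unknown) => Promise<T>): Promise<T> }
interface Queue { publish(type: string, body: unknown): Promise<void> }

async function onOrderPlaced(db: Db, queue: Queue, orderId: number): Promise<void> {
  // Small, local transaction: only this step's data is touched.
  await db.transaction(async () => {
    // ... reserve stock for the order ...
  });
  // The next step is triggered by a message; there is no long-lived
  // transaction spanning the whole business process.
  await queue.publish("stock.reserved", { orderId });
}
```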

The second one is about identifying your service boundaries. What you refer to as a module is a (micro)service. So your major goal is to identify your service boundaries so that the services are autonomous.

This series of posts covering these issues could be of some interest to you.

Vadim Samokhin
  • Thank you for your answer, I'll study it and the links provided. – Slava Fomin II Dec 16 '17 at 19:58
  • You're welcome! If you'll have any questions, feel free to ask. – Vadim Samokhin Dec 16 '17 at 19:59
  • I'm not sure we can say that event notification is the only solution; the eventual-consistency requirements of this application need to be evaluated. Maybe an event-carried state transfer can be useful. (https://martinfowler.com/articles/201701-event-driven.html) – Fabio Apr 30 '19 at 21:41

How to handle concurrency (transaction isolation)?

If you want to have a modular, decoupled system, the best way to do it is to split the functionality into modules/services which do not require synchronous communication to fulfill their function.

Yes, this disqualifies all the CRUD-based microservices that are basically just tables with an HTTP API.

That means all events should be fire-and-forget. Transaction boundaries then never reach across service boundaries, so you will generally not need distributed transactions.

How to approach information sharing?

Do not pass primary keys in messages if you are doing RESTful HTTP. Pass a URI where the resource is available.

But again, you should not pass anything that would require the receiving peer to query back to the sender or to anything else. Split the functionality in such a way that each message can be handled in its entirety by the receiver.
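
For example, a self-contained message might look like this (a sketch with hypothetical names; the URI is illustrative):

```typescript
// The event identifies the resource by URI (not a bare primary key)
// and carries everything the receiver needs to do its work.
interface OrderShippedEvent {
  type: "order.shipped";
  order: string;          // e.g. "https://shop.example/orders/42"
  shippedAt: string;      // ISO-8601 timestamp
  trackingNumber: string; // no query back to the sender required
}
```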

Here is some more description of this style: Self-Contained Systems.

Robert Bräutigam
  • Thanks Robert! But in my case, everything happens inside a single process. Both the senders and the receivers of events are in the same code base; the event system just helps to decouple the functionality. However, I agree that such a design should make it possible to turn each module into a microservice later on, if desired. – Slava Fomin II Dec 16 '17 at 21:23