
After reading/watching a lot about Event Sourcing, there is one thing I don't fully understand: Events that lead to triggering of other events.

Let's take StackExchange's "close question" feature as an example. The feature requires

  • commands like VoteToCloseQuestion, and CloseQuestion (direct close for moderators), and
  • events like VotedToCloseQuestion and ClosedQuestion -- note the use of past participle to express "historical facts".

The crucial point: If we reach a state of having 5 VotedToCloseQuestion, where does the event ClosedQuestion come from exactly?


Often the architecture of Event Sourcing + CQRS is presented in diagrams like:

(from a talk by Dennis Doomen)

(from a talk by Mathew McLoughlin)

What surprises me in these diagrams: the command handler apparently has no knowledge of past events. Because of that, I fail to see how events triggering other events can work exactly.

So far, the best hint I have found was this Q/A. The top-voted answer mentions the notion of "Event Handlers" that are responsible for feeding commands back into the system as a reaction to events. In an attempt to make this more concrete, I came up with the following interpretation (focusing on the command side only; pipes are topics/queues, solid arrows are publish, dashed arrows are subscribe):

However, such an architecture has some weird implications:

  • The CommandHandler is only subscribed to the commands topic. In particular, a CommandHandler itself is not subscribed to the event stream itself. This means that the CommandHandler cannot know the full state of the system. As a result, the only decisions it can make are stateless transformations of commands to events -- which is surprisingly boring.
  • The EventHandler, on the other hand, listens to the stream of events, so it can compute the entire application state (which it may store in a materialized view local to the EventHandler service). This is the place where the business logic can live, and where we can make decisions to trigger follow-up commands.

Looking again at the introductory example, the sequence of commands/events would become:

  • Command: VoteToCloseQuestion
  • CommandHandler issues event: VotedToCloseQuestion
  • EventHandler increments internal vote counter, but does nothing because < 5
  • ... 3 more ...
  • Command: VoteToCloseQuestion
  • CommandHandler issues event: VotedToCloseQuestion
  • EventHandler increments internal vote counter, sees the count is 5, and issues the CloseQuestion command
  • Command: CloseQuestion
  • CommandHandler issues event: ClosedQuestion
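
The sequence above can be played out in a few lines. This is only a minimal in-memory sketch of the interpretation described in this question, not a reference design: the tuple message format, the `run` loop standing in for the two topics, and the question id `1` are all assumptions made for illustration.

```python
from collections import defaultdict, deque

VOTES_TO_CLOSE = 5  # threshold taken from the example above


def command_handler(command):
    """Stateless: maps each command to exactly one event."""
    kind, question_id = command
    if kind == "VoteToCloseQuestion":
        return ("VotedToCloseQuestion", question_id)
    if kind == "CloseQuestion":
        return ("ClosedQuestion", question_id)
    raise ValueError(f"unknown command: {kind}")


class EventHandler:
    """Stateful: keeps a vote counter per question (a tiny materialized view)."""

    def __init__(self):
        self.votes = defaultdict(int)

    def handle(self, event):
        """Return the follow-up commands triggered by this event (possibly none)."""
        kind, question_id = event
        if kind == "VotedToCloseQuestion":
            self.votes[question_id] += 1
            if self.votes[question_id] == VOTES_TO_CLOSE:
                return [("CloseQuestion", question_id)]
        return []


def run(commands):
    """Hypothetical stand-in for the topics: drain commands, feed follow-ups back in."""
    queue, event_log, handler = deque(commands), [], EventHandler()
    while queue:
        event = command_handler(queue.popleft())
        event_log.append(event)
        queue.extend(handler.handle(event))
    return event_log


log = run([("VoteToCloseQuestion", 1)] * 5)
print(log[-1])
```

Note how `command_handler` never looks at anything but its argument, which is exactly the "boring" property described above; all the decision making sits in `EventHandler`.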

It seems to work, but I'm not sure if I'm on the right track. In particular I'm wondering:

  • Does the separation into commands and events really add value like this? It feels like I have just duplicated every command as an event again, and since the CommandHandler cannot contain significant business logic due to lack of context, it becomes mostly boilerplate?
  • Others (e.g. the CQRS FAQ) seem to imply that the business logic is in the command handler. How is this possible if it cannot know the application state?
bluenote10
  • It would be nice if it's possible to answer this question without using too much DDD terminology, since I find many of its terms quite vague. – bluenote10 May 04 '20 at 18:59
  • `The CommandHandler cannot know the full state of the system` -- Correct. That's what the purpose of the Event Store is. The Event Store allows you to "play back" the events to get the history of the state changes. – Robert Harvey May 04 '20 at 19:16
  • `It feels like I have just duplicated every command as an event again` -- If it helps, think of a command as a "patch panel" of sorts. A command can be patched into a bit of code that generates an event, but it can also be patched into some other piece of code that, say, disables a button. – Robert Harvey May 04 '20 at 19:27
  • Alternative example: [`ConfirmOrder` (command) vs `OrderConfirmed` (event) from the CQRS FAQ](https://cqrs.nu/Faq) stating: _Unlike an event, a command is not a statement of fact; it's only a request, and thus may be refused._ So if the CommandHandler lacks the context to refuse an event, it can only issue a `ConfirmOrderReceived` event or so. Later the EventHandler has the necessary context to translate that into a new command `ReallyConfirmOrder`, and finally the CommandHandler can publish the `OrderConfirmed` event? The job of the CommandHandler feels redundant somehow. – bluenote10 May 04 '20 at 19:32
  • A command has but one purpose: *serve as a subscriber endpoint for a user or system-initiated action.* That's all it does. Normally, I'd be the last person to invoke Uncle Bob here, but see https://en.wikipedia.org/wiki/Single-responsibility_principle. See also https://en.wikipedia.org/wiki/Command_pattern and https://en.wikipedia.org/wiki/Observer_pattern – Robert Harvey May 04 '20 at 19:36
  • The reason you're having difficulty with this concept is that you think that a command and an event are the same thing. They're not, especially in the context of CQRS. – Robert Harvey May 04 '20 at 19:39
  • Alas, I have read the two answers below, and they only add to the confusion. I'm voting to close this post, as I believe it is too broad. If you can make your question *more specific,* I'll retract my close vote. – Robert Harvey May 05 '20 at 15:38
  • Can you be more specific about what you're trying to clarify? I've laid out some groundwork in the comments above, but have not received any feedback on them. – Robert Harvey May 05 '20 at 18:03
  • @RobertHarvey How the flow of events + commands is in the example of "5 vote close make a close question". Ironically I'll rather get these 5 votes instead of an answer. – bluenote10 May 05 '20 at 18:14
  • There is a bit of logic somewhere in your program that keeps track of the previous votes. When a new vote comes in, that logic checks to see if the count has reached 5, and when it does, it generates a CloseQuestion event. Just as you stated in your post. – Robert Harvey May 05 '20 at 18:27
  • Some food for thought: 1. The diagrams you have in your post *don't tell the whole story.* There might be other objects not in your diagrams that perform tasks like determining when a post is closed. 2. The knowledge about whether or not a post has garnered enough close votes to close is located in a database somewhere. That could be the Event database, or it could be a central repository somewhere. Your diagrams don't describe that relationship either. – Robert Harvey May 05 '20 at 18:37
  • In short, the architecture gives you ways to communicate with the objects in your application, but it doesn't have anything to say about *what those objects look like.* You're free to design those objects in any way you see fit. – Robert Harvey May 05 '20 at 18:39
  • And by that I mean that one of your domain objects has the responsibility of working out whether or not a post is closed. Neither the commands nor the events have that responsibility. – Robert Harvey May 05 '20 at 18:52

3 Answers


Disclaimer: I don't promise that I've read all of that big-wall-o-text....

Let's take StackExchange's "close question" feature as an example. If we reach a state of having 5 VotedToCloseQuestion, where does the event ClosedQuestion come from exactly?

The event is computed by the domain model in response to the 5th VoteToCloseQuestion command.

Imagine if you will that we receive a message that reads VoteToCloseQuestion(409686, Erica). We use a repository to access the history of 409686; that history looks something like

OPENED
VOTED_TO_CLOSE(Alice)
VOTED_TO_CLOSE(Bob)
VOTED_TO_CLOSE(Charlie)
VOTED_TO_CLOSE(Deidre)

When the domain model applies the VoteToCloseQuestion command to this history, the result looks like

OPENED
VOTED_TO_CLOSE(Alice)
VOTED_TO_CLOSE(Bob)
VOTED_TO_CLOSE(Charlie)
VOTED_TO_CLOSE(Deidre)
VOTED_TO_CLOSE(Erica)
CLOSED

And the history of 409686 is updated to match this result.

That's the basic plot: the information we have stored under the id is used to feed back into the next computation. The result of that computation is then stored to be reused later.

The fact that we are "event sourcing" doesn't really change the basic flow. We still compute a new state by applying the new information to the old state.
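
A minimal sketch of that flow, for illustration only. The in-memory `Repository`, the tuple encoding of events, and the two extra rules (ignore votes on a closed question, one vote per user) are assumptions that do not come from the example; the command is called `VoteToCloseQuestion` here, following the naming in the question.

```python
VOTES_TO_CLOSE = 5


def decide(history, command):
    """Domain model: a pure function from (past events, command) to new events."""
    kind, user = command
    if kind != "VoteToCloseQuestion":
        raise ValueError(f"unknown command: {kind}")
    if ("CLOSED",) in history:
        return []  # assumed rule: votes on a closed question are ignored
    voters = {event[1] for event in history if event[0] == "VOTED_TO_CLOSE"}
    if user in voters:
        return []  # assumed rule: one vote per user
    new_events = [("VOTED_TO_CLOSE", user)]
    if len(voters) + 1 >= VOTES_TO_CLOSE:
        new_events.append(("CLOSED",))
    return new_events


class Repository:
    """Hypothetical in-memory event store, keyed by question id."""

    def __init__(self):
        self._streams = {}

    def load(self, question_id):
        return list(self._streams.get(question_id, []))

    def append(self, question_id, events):
        self._streams.setdefault(question_id, []).extend(events)


def handle(repo, question_id, command):
    """Command handler: plumbing only. Load history, ask the model, store the result."""
    history = repo.load(question_id)
    new_events = decide(history, command)
    repo.append(question_id, new_events)
    return new_events


repo = Repository()
repo.append(409686, [("OPENED",)] + [("VOTED_TO_CLOSE", name)
                                    for name in ("Alice", "Bob", "Charlie", "Deidre")])
result = handle(repo, 409686, ("VoteToCloseQuestion", "Erica"))
print(result)
```

The "missing arrow" from the diagrams is the `repo.load` call: the handler reads the past events before asking the model what happens next.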

Now, I'm going to point out, as gently as I can, that it is not your fault that you can't work this out from the pictures included in your question. The literature in this space, taken as a whole, really sucks. So you get pictures like these, which are missing the arrows that would show how past history gets re-used, which confuse the relative arrangements of the handler, domain model, and repository, and so on.

Some of that is a historical artifact from a time when people talked about the domain model as a layer, and imagined that they would be dealing with an object store as their persistence appliance, rather than a document store. They weren't thinking so much about stateless processes, and distributed access, and concurrency, and so forth.

Does the separation into commands and events really add value like this? It feels like I have just duplicated every command as an event again, and since the CommandHandler cannot contain significant business logic due to lack of context, it becomes mostly boilerplate?

Right - the command handler is application code; it doesn't do any of the useful modeling itself, but rather controls the flow of information between the data store, the domain model, the response to the message, and other third-party information sources/sinks.

The logic that actually drives the competitive advantage you accrue is in the domain model, not the handler. Think "separation of concerns"; you can change the domain logic and deploy a new model without needing to change the command handler. Similarly, you can take the domain model, and lift it into a different application context.

But it's all still happening "upstream" of the event store, on the "write" side of the picture.

The EventHandler on the other side listens to the stream of events

OK, remember how I said the literature sucks? Keep that in mind....

One of the problems that the literature has is that the word Event means a number of different things in different contexts. The things that we copy into our durable storage so that we can remember them later (events) are not necessarily the same thing as the things that we publish (events) so that others can react to our changes.

Another problem is that command and event are, semantically, somewhat dual to one another:

handle(Event) is a command
CommandReceived is an event

A message in my inbox is a command, and a message in my outbox is an event. Easy. But what is the message that has been copied from my outbox to your inbox? Is that third message an event? A command? Something else? Does putting the label on it change anything?

There are these vague DDD terms again. "Domain model" is that a microservice?

No, not quite. "Domain" here is "the subject area to which the user applies the program". Accounting, sales, billing, airline booking, would all be examples of domains.

A model is a selectively simplified and consciously structured form of knowledge. An appropriate model makes sense of information and focuses it on the problem.

A domain model is not a particular diagram, it is the idea that the diagram is intended to convey. It is not just the knowledge in the domain expert's head; it is a rigorously organized and selective abstraction of that knowledge.

Martin Fowler's definition is a bit more specific:

An object model of the domain that incorporates both behavior and data.

Domain models predate microservices; historically, they arrive at roughly the same time as "service oriented architecture", but as far as I can tell that's coincidence -- just one of a number of ideas that people started experimenting with during the dot-com boom at the end of the 1990s.

Aren't your questions at the end exactly my questions?

No, they are a rhetorical device intended to help you to see that investing in the ability to distinguish "events" from "commands" is largely a waste of time. Or, put another way, you can replace both of them with the abstraction "message" without losing important fidelity.

Formally, if the CommandHandler is a pure function of a Command, producing zero or more Events, there is no difference in storing either the function's input or output in terms of its information value.

Careful. Time and change put pressure on that claim. Imagine a state machine where, after it has processed a few inputs, we now deploy new logic. Do we want the state machine to magically teleport to a new state? Or do we want the state machine to continue from its current state, following the new rules?

Unless you are very careful, input storage gives you the magic teleport behavior.
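
To make the "teleport" concrete, here is a small sketch under assumed rules: the close threshold starts at 5 and a later deployment lowers it to 3. The threshold change and all names are hypothetical, chosen only to show the difference between replaying stored inputs and reading stored outputs.

```python
def decide(votes_so_far, threshold):
    """Turn one vote command into events, under whatever rule is deployed right now."""
    events = ["VOTED_TO_CLOSE"]
    if votes_so_far + 1 == threshold:
        events.append("CLOSED")
    return events


def state_of(events):
    return "closed" if "CLOSED" in events else "open"


# Three votes arrive while the deployed rule says "close at 5".
stored_commands = ["VoteToCloseQuestion"] * 3
stored_events = []
for _ in stored_commands:
    stored_events += decide(stored_events.count("VOTED_TO_CLOSE"), threshold=5)

# New logic is deployed: "close at 3".

# Output storage: the recorded events still describe what actually happened.
state_from_stored_events = state_of(stored_events)

# Input storage: replaying the old commands through the new rule rewrites history.
replayed = []
for _ in stored_commands:
    replayed += decide(replayed.count("VOTED_TO_CLOSE"), threshold=3)
state_from_replayed_commands = state_of(replayed)

print(state_from_stored_events, state_from_replayed_commands)
```

With stored events, the question stays open and only future votes follow the new rule; with stored commands, a question that nobody ever saw close is suddenly closed after the deployment.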

A different claim, consistent with what we said earlier, is that it doesn't particularly matter whether the input messages are "commands" or "events".

VoiceOfUnreason
  • Hm well the general plot was clear. All that "big-wall-o-text" was really about the details of these "missing arrows". – bluenote10 May 04 '20 at 21:47
  • I tried to extend the answer to address more of your edge concerns. Let me know if that helped. – VoiceOfUnreason May 04 '20 at 22:23
  • There are these vague DDD terms again. "Domain model" is that a microservice? What's the message flow to/from it? Are you implying my statement is wrong? Aren't your questions at the end exactly my questions? Oh boy, have I failed at getting this question across. – bluenote10 May 05 '20 at 05:59
  • Took another try. – VoiceOfUnreason May 05 '20 at 10:32
  • Interesting that you come to a similar conclusion regarding distinguishing events and commands. Others seem to have a very strong opinion that this separation is important, see the comments on the question itself. Formally, if the CommandHandler is a pure function of a Command, producing zero or more Events, there is no difference in storing either the function's input or output in terms of its information value. In practice, the picture might be different if side-effects come into play, or if either storing the function input or output has practical pros/cons. – bluenote10 May 05 '20 at 18:57
  • Thanks to you both for making this discussion a thing. Great content here. – MMalke Sep 02 '20 at 17:32

I would paraphrase your concerns as follows:

  1. A (mostly) one-to-one correspondence between commands and events not only requires more (boilerplate) code but also seems redundant.

  2. Having the business logic in the command handler seems the wrong place.

Here is my 10 cents worth.

Point 1. Yes, there is duplication. One principle of software development is to find the right abstraction (which can be very difficult - the wrong abstraction can cause nightmares). With SQL, the abstraction is trigger functions executed at certain points (BEFORE/AFTER and INSERT/UPDATE/DELETE). It's abstract because the trigger function itself has to infer meaning from the event. E.g., if OLD.is_email_confirmed = false and NEW.is_email_confirmed = true, then the user must have clicked "Verify Email Address" in the registration email, so we now need to queue a new welcome email by issuing INSERT INTO email_queue (user_key, content) VALUES (...). The system is not responding to a more specific (and semantic) EmailWasConfirmed event and is not issuing a more specific (and semantic) QueueWelcomeEmail command.
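
The contrast can be sketched outside the database as well. This is only an illustration: `on_user_updated` and `on_email_was_confirmed` are hypothetical function names, rows and messages are plain dicts and tuples, and the `"welcome"` content value is a placeholder.

```python
def on_user_updated(old, new):
    """Generic update hook (like an AFTER UPDATE trigger): meaning is inferred from the row diff."""
    actions = []
    if not old["is_email_confirmed"] and new["is_email_confirmed"]:
        # The hook has to guess that this diff means "the user confirmed their email".
        actions.append(("INSERT INTO email_queue", new["user_key"], "welcome"))
    return actions


def on_email_was_confirmed(event):
    """Semantic handler: the event already says what happened, so no guessing is needed."""
    return [("QueueWelcomeEmail", event["user_key"])]


old_row = {"user_key": 7, "is_email_confirmed": False}
new_row = {"user_key": 7, "is_email_confirmed": True}
print(on_user_updated(old_row, new_row))
print(on_email_was_confirmed({"type": "EmailWasConfirmed", "user_key": 7}))
```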

My experience with event sourcing so far is that it works best when the problem and solution have a well-understood, limited scope. In most applications, management change the behaviour (often without thinking things through) and this also requires changes to the properties. Having commands and events for each property IS redundant. I have changed my approach to use a primary key that includes a revision number, allowing me to store multiple copies of an entity. E.g.:

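CREATE TABLE "user"
(
    uuid UUID NOT NULL,
    revision INTEGER NOT NULL,
    username CHARACTER VARYING(126) NOT NULL,
    ...
);

-- One row per (uuid, revision): every save inserts a new revision.
ALTER TABLE "user" ADD PRIMARY KEY (uuid, revision);

-- Note: a plain unique index on username also rejects a second revision that keeps the same username.
CREATE UNIQUE INDEX ON "user" (username);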

In the above scenario, saving the user involves inserting a new record with an incrementing revision. One argument against this is that it adds a lot of overhead, but on the other hand, having one command/event per property is not only a lot of overhead for the programmer, but storing all these events doesn't really save a whole lot of storage. Some people will further argue that in properly event sourced applications, a single event could encapsulate semantic changes to multiple properties, and that's true, but some applications really are just CRUD, either because of their nature, or because they end up with CRUD because management change their minds every month. The rules in business software, such as when to offer a marketing promotion, are not as precise as mathematical rules (think Pythagoras' theorem), and therefore you'll end up with data that broke rules you previously thought were invariants.
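
A minimal sketch of "save means insert a new revision", using Python's built-in `sqlite3` with an in-memory database rather than whatever database the schema above targets. The table name `user_revision` and the helpers `save_user` and `load_user` are hypothetical, and the unique username index is left out of this sketch because every revision of a user would otherwise collide with the previous one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_revision (
        uuid     TEXT    NOT NULL,
        revision INTEGER NOT NULL,
        username TEXT    NOT NULL,
        PRIMARY KEY (uuid, revision)
    )
""")


def save_user(conn, uuid, username):
    """Insert a new row with the next revision number instead of updating in place."""
    with conn:  # commits on success, rolls back on error
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(revision), 0) FROM user_revision WHERE uuid = ?",
            (uuid,),
        ).fetchone()
        conn.execute(
            "INSERT INTO user_revision (uuid, revision, username) VALUES (?, ?, ?)",
            (uuid, latest + 1, username),
        )
    return latest + 1


def load_user(conn, uuid):
    """The current state is simply the row with the highest revision."""
    return conn.execute(
        "SELECT revision, username FROM user_revision"
        " WHERE uuid = ? ORDER BY revision DESC LIMIT 1",
        (uuid,),
    ).fetchone()


save_user(conn, "u-1", "alice")
save_user(conn, "u-1", "alice_renamed")
print(load_user(conn, "u-1"))
```

Older revisions stay in the table, so the history is kept without defining a command and an event per property.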

Point 2. In the above SQL trigger example, the business logic was in the event handler. An analogy that helped me understand it is to think of a scenario where there are a million employees, each with a single job. Computers don't exist. Employees "store" data by writing it on a personal notepad, and only communicate with each other by passing hand-written notes. Each "vote" to close a question is an employee, who records who voted for the close and which question was voted for. They then write a note and pass it to the "close vote counter" employee (one employee per question). That employee's job is simply to count the number of close votes, and if it reaches 5, write a note to the "question" employee to close the question. I have explained it poorly, but this is the "actor" model, and it fits well with event sourcing.

I've since moved on from event sourcing, actor systems, etc, although I still find the principles interesting. Just get the job done, because even the cleanest architecture in the world will be messed up by people who don't understand or don't agree with it, or the business ages and those who once understood are no longer employed there. Stick with the tried and true - stateful entities persisted in an SQL database (or databases, according to bounded context).

magnus

You just need to add a listener and a feed to your pattern. How you do this may be dependent on your implementation; most commonly I've seen this in the form of writing multiple projections/events from the repository (or event store depending on which diagram in your question you are looking at). More recently I've seen more event streaming (e.g. Kafka) implementations. I've also seen it in the form of using database triggers, but … sniff sniff.

If you take the example below, the repository is producing two projections for two different event handlers, one event handler is writing the projection to the query model and the other is doing something else that needs doing. That could be interfacing with the domain to achieve something or messaging a completely different component.


This allows for events to trigger events inside or outside the context of a single system. In your specific example of having 5 votes to close a question, you would expect to see 5 requests of 'VoteToCloseQuestion' come through the Command API and the Command Handler would issue 5 commands of 'VoteToCloseQuestion' to the Domain. The Domain is where all your business logic lives (I'd also say that the Command Handler exists in the Domain, but I've tried to keep the component terminology and granularity the same as your own diagrams).

The Domain then notifies your Repository of zero or more events as a result of the command. The simple case of a 'VoteToCloseQuestion' will probably lead to a single event sent to the Repository. The Repository then adds an event to the Event Store.

The Repository in this instance is also responsible for outputting (projecting) the state of a record to be recorded in the Query Store. i.e. the Query Store doesn't hold events of 'VoteToCloseQuestion' it holds an aggregate of 'NumberOfVotesToClose'. The Repository creates aggregates/state by replaying the events in the Event Store and sends a projection of current state to an Event Handler to update the Query Store.

By creating a second projection of current state (or piggybacking on an existing one), another Event Handler can act upon certain state changes, e.g. when 'NumberOfVotesToClose' reaches 5. One of these actions could be to issue another command 'CloseQuestion'.

Going back through the loop, this command correlates to a 'CloseQuestion' event being stored in the Event Store. The state of 'QuestionStatus' is then Projected as 'Closed'. If the Domain ever needs to know the current state of 'QuestionStatus,' to decide whether to accept a command for instance, it asks the Repository which will replay the events in the Event Store to get the state.
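
The loop described in this answer can be sketched as follows. It is a single-process illustration only: the class and function names are assumptions, the projection is a plain dict, events are named after their commands as in the text above, and a `deque` stands in for the Command API.

```python
from collections import deque


class Repository:
    """Owns the Event Store, replays it into state, and pushes projections to handlers."""

    def __init__(self):
        self.event_store = []
        self.event_handlers = []

    def current_state(self):
        """Rebuild the current state by replaying every stored event."""
        state = {"NumberOfVotesToClose": 0, "QuestionStatus": "Open"}
        for event in self.event_store:
            if event == "VoteToCloseQuestion":
                state["NumberOfVotesToClose"] += 1
            elif event == "CloseQuestion":
                state["QuestionStatus"] = "Closed"
        return state

    def add(self, events):
        for event in events:
            self.event_store.append(event)
            projection = self.current_state()
            for handler in self.event_handlers:
                handler(projection)


class Domain:
    """Business logic: asks the Repository for state before accepting a command."""

    def __init__(self, repository):
        self.repository = repository

    def handle(self, command):
        if self.repository.current_state()["QuestionStatus"] == "Closed":
            return  # assumed rule: commands against a closed question are refused
        self.repository.add([command])


commands = deque(["VoteToCloseQuestion"] * 5)  # stand-in for the Command API
query_store = {}


def update_query_store(projection):
    """First Event Handler: keeps the read model up to date."""
    query_store.update(projection)


def close_when_enough_votes(projection):
    """Second Event Handler: reacts to a state change by issuing a new command."""
    if projection["NumberOfVotesToClose"] == 5 and projection["QuestionStatus"] == "Open":
        commands.append("CloseQuestion")


repository = Repository()
repository.event_handlers += [update_query_store, close_when_enough_votes]
domain = Domain(repository)

while commands:
    domain.handle(commands.popleft())

print(query_store)
```

Replaying the whole store on every call is wasteful but keeps the sketch short; the point is only that the second handler closes the loop by putting a command back on the queue.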

K Mo
  • This is just another version of these vague diagrams that I don't understand. What are these vague components "Domain" or "Repository" (what are they? microservices?). I picked a concrete example in the hope of moving beyond this high level perspective. I'd be really curious to play out the message flow of such a system based on a simple example. – bluenote10 May 05 '20 at 06:03
  • @bluenote10 I've updated my answer explaining a bit about the message flow. From your question I thought you understood what the components you mentioned were and I tried to use the same terminology. The components are as large or as small as they need to be in context of the application they represent, but both the Domain and Repository are bigger than micro services (although they may be a collection of micro-services). The Domain in this context is where all your business logic sits while the Repository is your interface to the physical data storage (e.g. Axon Server). – K Mo May 05 '20 at 09:10