Confusion about the meaning of the word aggregate in domain driven design

Question

In a discussion about domain driven design I have learned that different people seem to think of different things when using the word aggregate. The main difficulty is that some people use the word aggregate for what other people call an aggregate type.

It is quite difficult to have a discussion if people assume different meanings for the same words. For this reason I set out to clarify what most people and the literature agree on. If you answer this question, I would be very happy if you could provide a reference to the literature.

For one person an aggregate is the boundary that groups a collection of entities. It is more of a conceptual clustering boundary.

For another person an aggregate is a collection of entities transferred from a database repository (having transactional consistency). So an aggregate is something real and not just a concept. If, for example, I load two users from a database, then I have loaded two aggregates of the same aggregate type.

Another person also thinks that an aggregate is a collection of entities that are transactionally consistent, but thinks that if you load data of a given aggregate type you can also load it partially (with some data just null, for example) and still call the whole thing one aggregate, while others would see this as two aggregates (with eventual consistency, meaning that consistency is reached after both aggregates are saved).

To find the true meaning of the word aggregate myself, I have had a look at Martin Fowler's definition. Here an aggregate is something real and there can be two aggregates of the same aggregate type. But when reading something like this article from Vaughn Vernon, I get the impression that he calls aggregate what, according to the 'Martin Fowler like interpreted understanding', should be called aggregate type.

score 8 · Accepted Answer · answered Dec 17 '15 at 18:31

For terminology in Domain Driven Design, start from "the blue book" -- Domain Driven Design by Eric Evans.

AGGREGATE A cluster of associated objects that are treated as a unit for the purpose of data changes. External references are restricted to one member of the aggregate, designated as the root. A set of consistency rules applies within the aggregate's boundaries.

That last sentence, I think you can turn around -- the boundaries of the aggregate are defined by the consistency rules.

It's definitely the case that an aggregate has state. Each time the domain model changes, an aggregate is taken from one consistent state to another. The data that we persist is used to reconstruct this state. So in that sense, it is a real thing.

But the aggregate itself doesn't necessarily have a word in the ubiquitous language. It's a derived concept.

Broadly, we could put the entire domain model under a single aggregate that enforces all of the consistency rules. We don't, because that design doesn't scale: we can't change the domain model two different ways at the same time, even when the changes we are making don't share any consistency rules. It's a poor way to model a business that can do more than one thing at a time.

Instead, we decompose the consistency rules into sets, subject to the constraint that two rules that reference the same data must be part of the same set. (In doing this, we are also working with the ubiquitous language and the domain experts to determine if we are correctly describing the consistency rules).

To update the model, we identify the aggregate responsible for a piece of data and propose the change. If the aggregate verifies all of its local consistency rules, we know that the change is globally valid, and we can apply the change. This restores our ability to do more than one thing at a time - changes to data in different aggregates can't possibly conflict with each other, by construction.
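As a rough illustration of "propose the change, verify the local rules, then apply", here is a minimal Python sketch. It assumes a hypothetical `ShoppingCart` aggregate whose only consistency rule is an invented cap on total quantity; `ShoppingCart`, `CartItem`, `MAX_TOTAL_QUANTITY`, `add_item`, and `InvariantViolation` are illustrative names, not anything taken from Evans's book.

```python
from dataclasses import dataclass, field


class InvariantViolation(Exception):
    """Raised when a proposed change would break a consistency rule."""


@dataclass
class CartItem:
    # Entity inside the aggregate; only reached through the root.
    sku: str
    quantity: int


@dataclass
class ShoppingCart:
    # Aggregate root: the only object that outside code references.
    cart_id: str
    _items: dict = field(default_factory=dict)

    MAX_TOTAL_QUANTITY = 10  # hypothetical business rule (class attribute)

    def total_quantity(self) -> int:
        return sum(item.quantity for item in self._items.values())

    def add_item(self, sku: str, quantity: int) -> None:
        # Check every local consistency rule before mutating any state.
        if quantity <= 0:
            raise InvariantViolation("quantity must be positive")
        if self.total_quantity() + quantity > self.MAX_TOTAL_QUANTITY:
            raise InvariantViolation("cart would exceed the quantity limit")
        item = self._items.get(sku)
        if item is None:
            self._items[sku] = CartItem(sku, quantity)
        else:
            item.quantity += quantity


# Two aggregates of the same type share no rules, so their changes
# cannot conflict with each other.
cart_a = ShoppingCart("a")
cart_b = ShoppingCart("b")
cart_a.add_item("book", 8)
cart_b.add_item("book", 8)
```

In this sketch a rejected change leaves the cart untouched, because the rules are checked before anything is modified.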

Best practices suggest that most aggregates should contain only the root entity. So you can conflate the aggregate with the entity without too much risk. But my guess is there won't usually be anything in the ubiquitous language to hang on the cluster when it includes more than one entity; so you end up with the ShoppingCart aggregate maintaining the consistency rules for the ShoppingCart entity and the CartItems entity collection and....

Partial loading of an aggregate is broken when trying to apply a change -- how could a well designed aggregate possibly validate all of its consistency rules with a subset of the data? It's certainly the case that, if you have a requirement where this makes sense, your modeling is broken somewhere.
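To make the problem concrete, here is a small Python sketch under invented assumptions: a cart with a made-up limit of 10 units in total, and a hypothetical `can_add` check that only sees whatever item quantities were loaded.

```python
MAX_TOTAL_QUANTITY = 10  # hypothetical rule guarded by the aggregate


def can_add(loaded_quantities: dict, sku: str, quantity: int) -> bool:
    """Check the rule against whatever item quantities were loaded."""
    current_total = sum(loaded_quantities.values())
    return current_total + quantity <= MAX_TOTAL_QUANTITY


# Full state of the cart as persisted: 9 units in total.
stored = {"book": 6, "pen": 3}

# A partial load that skipped the "pen" line: only 6 units are visible.
partial = {"book": 6}

full_answer = can_add(stored, "lamp", 2)      # checks 9 + 2 against the limit
partial_answer = can_add(partial, "lamp", 2)  # checks 6 + 2 against the limit
```

With the full state the change is rejected, while the partial load accepts the same change, so the rule is silently violated.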

But if you are doing a read, loading only some of the data guarded by the aggregate can make sense. Command Query Responsibility Segregation (CQRS) takes this a step further; once the model has verified that the data satisfies the consistency rules, you can completely rearrange that data into whatever read-only form makes your life easiest. Put another way, if you aren't concerned with data changes, you don't need to worry about the aggregate boundary at all.
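A read-only form of that kind might look like the following Python sketch. `CartSummary` and `project_summary` are hypothetical names, and the input is assumed to be data that the write side has already validated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CartSummary:
    # Read-only view; it carries no consistency rules of its own.
    cart_id: str
    line_count: int
    total_quantity: int


def project_summary(cart_id: str, stored_items: dict) -> CartSummary:
    """Build a query-side view from data the write side already validated."""
    return CartSummary(
        cart_id=cart_id,
        line_count=len(stored_items),
        total_quantity=sum(stored_items.values()),
    )


summary = project_summary("a", {"book": 6, "pen": 3})
```

The view is frozen because nothing on the query side is supposed to change it; any change goes back through the aggregate.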

score 6 · Answer 2 · edited Dec 21 '20 at 21:39

There is no "database" or "transaction" in DDD. DDD is completely agnostic to databases, transactions or "eventual consistency". So any definition that includes those is not valid for DDD. That basically leaves only your first definition, which I feel is the correct one. Also, I don't see how you can feel Martin Fowler's description fits anything other than the first definition.

But I can see how confusion can creep in when you think about "practical DDD". That means DDD + infrastructure built on top of it. For example, it might not make sense to materialize the whole aggregate from the database if you are working with just part of it. But then, I would question whether the aggregate is really properly designed. Maybe it should be split into multiple aggregates.

Either way, if you start bringing infrastructure into it, it stops being a concept of DDD and becomes a different concept. The only thing you can do is to realize this and make sure everyone on the team agrees on what "aggregate" means and what properties it has. Remember, names are primarily for efficient communication, and communication with your team has priority over communication with the outside world.

edited Dec 21 '20 at 21:39

Robert Harvey


answered Dec 17 '15 at 15:24

Euphoric


I feel that Martin Fowler means something like the second description because he says: "Aggregates are the basic element of transfer of data storage - you request to load or save whole aggregates. Transactions should not cross aggregate boundaries." If you can load or save an aggregate it does not fit with the first idea of a clustering concept only. – Sjoerd222888 Dec 17 '15 at 15:38
I don't get why you need to load the whole aggregate just to enforce rules. I mean if I understand well, we are talking about invariants which must be enforced and to do that you need probably some properties of entities or value objects, not all of them. You just need to know which ones you need to enforce the invariants, not more. But maybe I am wrong about this. – inf3rno Oct 16 '17 at 18:41
Unless we need to check on every change whether the new value is the same as the old one was. – inf3rno Oct 16 '17 at 18:51
"There is no "database" or "transaction" in DDD." -- Could you please elaborate on this? As far as I can tell, the "Aggregates" section of chapter 6 of Eric Evans's DDD book is devoted to discussion of how DB transactions define boundaries of aggregates. Just to give one quote: "Invariants, which are consistency rules that must be maintained whenever data changes... But the invariants applied within an AGGREGATE will be enforced with the completion of each transaction." – Myk Jul 07 '18 at 00:30

score 4 · Answer 3 · answered Dec 17 '15 at 19:25

It looks to me like Vaughn is struggling with the practical problems of monolithic aggregate roots in 'line of business' style software.

Whereas Martin likes monolithic OOP software, which works well when you can fit your whole process into memory.

I don't think there is a contradiction between the two though. You have to choose where you split your domain objects into contexts and aggregates in DDD. On one hand you want to have aggregate roots which encompass an entire real life process; on the other you have practical difficulties if these get too big, so you may have to split them into smaller pieces and join them up with domain events or other trickery.
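One way to picture "smaller aggregates joined up with domain events" is the Python sketch below. Everything in it (`Order`, `StockItem`, `OrderPlaced`, the in-process hand-off) is an assumption made for illustration; a real system would usually hand the event over through some messaging or persistence mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    # Domain event: a fact that the Order aggregate has already accepted.
    order_id: str
    sku: str
    quantity: int


@dataclass
class Order:
    # Small aggregate that guards only its own "placed once" rule.
    order_id: str
    placed: bool = False

    def place(self, sku: str, quantity: int) -> OrderPlaced:
        if self.placed:
            raise ValueError("order already placed")
        self.placed = True
        return OrderPlaced(self.order_id, sku, quantity)


@dataclass
class StockItem:
    # Separate aggregate that guards only its own stock level.
    sku: str
    on_hand: int

    def handle(self, event: OrderPlaced) -> None:
        # Applied in a separate step from the change to Order.
        if event.quantity > self.on_hand:
            raise ValueError("insufficient stock")
        self.on_hand -= event.quantity


order = Order("o-1")
stock = StockItem("book", on_hand=5)
event = order.place("book", 2)
stock.handle(event)
```

Each aggregate checks only its own rules, and the event is the only thing that travels between them.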
