DataAccess Layer coupling with Domain Layer

Question

We have an issue with how the implementation of the Data Access layer (EF6 Includes more specifically), influences the behavior of our Domain layer.

A theoretical example to illustrate:

Application in 3 layers, DDD:

Domain layer: simplistic domain model
- An Order has Orderlines
- An Order has a public property "CanBeExtended". Business logic is
  "when there are less than 10 Orderlines, an order can be extended"
- Note: This implementation is unit testable, no dependencies
Data Access layer: this layer contains the OrderRepository, which uses EF6 to retrieve an Order object from the database.
- GetOrderById is the only relevant method for this example
- References the Domain layer.
Service layer: contains "Controllers" in MVC-speak, is accessed via the API
- References both Data Access and Domain layer
- Exposes a method "CanOrderBeExtended": input is the id of the order, returns a bool whether the order can be extended

Implementation of this Service layer: Retrieve the Order object from the repository using GetOrderById, then call its property CanBeExtended, and return that value.

Problem: Depending on the implementation of the Data Access layer, the Domain layer behaves wrong: depending on whether you Include the Orderlines of the Order in GetOrderById, the CanBeExtended behaves differently.

The Bug: When you forget the Include, all Orders are extendable.

Possible solutions:

Lazy loading is considered, but it's not our favorite. It's a technical detail of EF6, and it might turn out to be problematic in a late stage, and highly dependent on the actual setup: db server access speed,...

We're looking for a more fundamental solution:
How can we mitigate this dependency between the Domain and the Data Access layer?
How can we make sure we don't forget Includes?

Making multiple and specific methods on the repository is considered as well, but it doesn't appear to scale well:

This example might then use something like "GetOrderWithOrderLinesById" which contains the Include(o => o.OrderLines).
But to decide which method is needed, we'd have to review the implementation of CanBeExtended in the Domain layer.
Additionally, when that implementation changes, the repository method needs to change as well (e.g. to add extra Include statements), or another, new method would turn out to be needed, thus highly coupling the Repository (in the Data Access layer) with the Domain layer.

Fowler nor Evans seem to touch this specific topic AFAIK.

Any and all thoughts welcome!

Flater · Accepted Answer · 2019-10-10T11:18:32.510

First of all

A relevant answer of mine I think you should read.

Your question is based on your interpretation on the fundamentals of repositories with EF, and that interpretation is not complete (that's not to say mine is provably complete either). This leads to the issues you're uncovering.

However, it requires you to change your interpretation, rather than try to solve the issue the way you're trying to now. I strongly urge you to reconsider your architectural approach, because a more abstracted variant will make this problem a lot easier to tackle.

Answering the actual question

Never forget the responsibilities of each layer.

The DAL stores and retrieves data for you.
The BLL performs operations and adheres (and forces other to adhere) to the required business rules, which may or may not involve speaking to the data store.
The Service layer (as per your naming) is the public interface which gives users access to the BLL operations.

Problem: Depending on the implementation of the Data Access layer, the Domain layer behaves wrong: depending on whether you Include the Orderlines of the Order in GetOrderById, the CanBeExtended behaves differently.

TL;DR The DAL is a dependency of the BLL. The BLL cannot question the DAL's responses as the BLL's actions are based on the responses received from the DAL.

The core issue here is that you are making the BLL responsible for making a judgment call (can this order be extended?), but you are not giving the BLL the autonomy to verify the information it needs to make that judgment call.

The way you currently have it, the BLL's validator can be fooled by passed a bogus list of orderlines into it. The validation is supposed to work as a line of defence which prevents storage of data that violates business rules, and as you've pointed out yourself this line of defence can be easily circumvented. The simple truth is that your line of defence is ineffective.

If the BLL is expected to be the judge on whether the order can be extended or not, the BLL must be able to check the database to count how many lines currently exist. Without it, the BLL cannot make this judgment call.

In other words, you expect something along the lines of:

public bool CanOrderBeExtended(Order order)
{
    var existingOrderLineCount = unitOfWork
                                     .OrderLineRepository
                                     .GetByOrderId(order.Id)
                                     .Count();

    return existingOrderLineCount < 10;
}

And there you have your answer. The BLL will not behave differently based on the data an external consumer passes to it. The BLL will protect the datastore by verifying the datastore itself.

Note that if the Order object is fetched by the BLL itself and the DAL method that was used to fetch the order already guarantees that all existing order lines are fetched as well (e.g. GetOrderWithAllOfItsOrderLines); then the validation does not need to access the database again, because the DAL itself has already guaranteed that the current list of orderlines is correct.

Something like this would also be correct:

public OrderDto GetOrderById(int orderId)
{
    var order = unitOfWork
                    .OrderRepository
                    .GetOrderWithAllOfItsOrderLines(orderId);

    var result = new OrderDto()
                 {
                     Order = order
                     CanBeExtended = order.OrderLines.Count() < 10;
                 };

    return result;
}

The BLL relies on the DAL to provide the actual contents of the datastore.

But what if the datastore doesn't return all orderlines?

If the datastore is not performing its job (e.g. it only returns one orderline even though there are 15 order lines in the database for this order), then the BLL cannot be expected to protect against that. That is an issue you have to detect (via testing) and solve in the DAL.

The DAL is a dependency of the BLL. The BLL depends on the DAL to be correct. If the DAL is not correct, the BLL cannot be expected to know better than what its own dependencies tell it.

Your proposed solutions

Lazy loading is considered, but it's not our favorite. It's a technical detail of EF6, and it might turn out to be problematic in a late stage, and highly dependent on the actual setup: db server access speed,...

If you lazy load from the BLL, that means you are leaking IQueryables. DO NOT DO THIS. Leaking IQueryables outside of the DAL means that you've compromised the separation of your layers.

The issue you're highlighting (a second DB call) further compounds the problem but it is not the main decision point as to why you shouldn't be doing this. There are valid solutions which do use a second db call, which might not be what you need in this particular instance but are a better solution than lazy loading nonetheless.

Making multiple and specific methods on the repository is considered as well, but it doesn't appear to scale well

It scales as well as you can reasonably expect it to.

In this context, you stating it does "not scaling well" seems to imply that you are expecting a generic solution to load any related entity for any entity you're currently dealing with. That's not a reasonable expectation as it compromises the DAL's internal responsibility.

The DAL decides what data gets fetched from the database. The BLL does not decide that. At best, the BLL is allowed to communicate to the DAL using methods which the DAL expliticly exposes.
Simply put, the DAL decides to allow consumer to ask it to retrieve the order lines (= the DAL exposes a public method), and then consumers (the BLL in this case) are able to use this public method to do the thing the DAL has allowed them to do (fetching the order lines).

Allowing the BLL to dynamically alter the fetched data without having an explicitly defined DAL method implies that you are either leaking your IQueryables (as mentioned before, do not do this), or your DAL has some specific publci method which you've created for this purpose (which avoids IQueryable leakage).

The latter inherently means that you've custom built a DAL method that the BLL calls, which according to you "does not scale well". Therefore, by elimination, the only way in which you could have a method which "scales well" is by leaking the IQueryable, which you should not do.

In short, your expectation of how scalable something needs to be is leading you to compromise the separation of your layers. If you want to uphold the separation of your layers, the DAL needs to explicitly expose custom methods (i.e. written by the DAL developer, not just passed on from the EF library) which allow for the retrieval of related entities.

But to decide which method is needed, we'd have to review the implementation of CanBeExtended in the Domain layer.

This is a vary confusing point your making.

I think you're putting the cart before the horse. The DAL doesn't include the order lines just because the BLL wants to validate the data. The DAL includes the order lines because it has either privately decided or is explicitly asked to include them.

Why the DAL was asked to include them (by the BLL) is irrelevant as far as the DAL cares. Yes, the BLL may have asked the DAL for the order lines specifically to validate some business logic, but the DAL doesn't need to know that. All it needs to know is that it was asked to fetch the order lines from the datastore, and therefore it returns the order lines from the datastore.

That is the only responsibility of the DAL. It does not care about your business logic at all, because the DAL lives one layer deeper than the business logic.

We're clearly thinking backwards judging from your -great- answer. I've some follow-up questions. You say: "the BLL asks the DAL for the order lines specifically to validate some business logic". This clearly solves the issue. But what about the navigation property `OrderLines` on BBL object `Order`, which EF could have filled in beforehand (when asked with an `Include` statement)? It seems this OO idea isn't valid in this context? — jan, Oct 11 '19 at 08:56
About your example `public bool CanOrderBeExtended(Order order)` (BBL for the record?). On what (kind of) object is this an instance method? We would write it like this: `public bool Order.CanBeExtended()` Does this differ fundamentally? — jan, Oct 11 '19 at 08:58
@jan: What the DAL does internally (EF or otherwise) is the DAL's business, the BLL doesn't care. What the BLL cares about is what the DAL claims it can provide to its consumers (i.e. public methods). If the DAL exposes a "get order with its lines" method, the BLL has no interest in whether the DAL fetches that data in one query, two queries, or a hundred queries. It matters not. What matters is that the expected behavior (returning an order with its lines) is carried out. — Flater, Oct 11 '19 at 09:00
@jan: Where you put that method can vary based on your approach (e.g. anemic model, fat model, DDD, ...) and isn't really important to the issue at hand. So long as the BLL performs the expected validation (no matter which architecture you use), the same general logic applies. Try to separate the theoretical from the practical. My example wasn't a practical "you should do it exactly like this", but more of a theoretical "something like this" example. — Flater, Oct 11 '19 at 09:03
@jan: I'm not trying to dodge an answer here, but I am avoiding pigeonholing you in one particular implementation when in reality there are still many ways to implement this based on the architecture/framework that is most relevant to your application. But in all cases, the BLL should not be concerned about the DAL's internal workings. The BLL can depend on the DAL pretty much without questioning its correctness. If the DAL is behaving incorrectly, that's the DAL's problem and the BLL shouldn't try to correct/cover this. — Flater, Oct 11 '19 at 09:07
In general we seem to have the lifetime of our objects backwards: we have: 1. The Service layer asks the DAL for a BLL/Domain object: `Domain.Order order = orderRepository.GetOrderById(orderid);`, 2. we then query that BLL/domain object: `var result = order.CanBeExtended();` We firstly retrieve using the DAL, and next use the BLL/domain object. Our BLL itself isn't a consumer of DAL, it _relies_ on an external party ("Service layer" in our case) to ask the DAL the _right questions_. — jan, Oct 11 '19 at 10:54
Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/99775/discussion-between-jan-and-flater). — jan, Oct 11 '19 at 10:55
This is a complete misunderstanding of clean architecture. You are conflating the dependency graph and control flow (and as a result the purpose of IoC). So I will explain: First we want, at *compile time* a dependency graph that looks like `Service -> BLL <- DAL`, because this configuration keeps our BLL dependency-free (**so it can be tested**) as well minimizes the overall dependency graph. Any other configuration either adds a dependency to our BLL *and/or* increases total coupling. Second we want, at *runtime*, a flow of control that looks like `Service -> BLL -> DAL -> BLL -> Service`... — user3347715, Oct 13 '19 at 20:41
... for the reasons you explained in your comments. The problem is that these are at odds with one another (`BLL <- DAL != BLL -> DAL`) We introduce IoC to solve this problem. That is, we resolve `BLL -> DAL` at *runtime* using an interface. In concrete terms, it means that our `Repository` interfaces are part of the `BLL` and the implementations are part of the `DAL`. In this way we can achieve clean architecture. @jan. Your architecture is sound. The problem you are facing is that you have not discovered an appropriate model. — user3347715, Oct 13 '19 at 20:41
@king-side-slide: I feel you may have missed the point that what OP calls a "service" is the top-level application (e.g. ASP.NET MVC, Web API, ...) and not a backend service object (to lower confusion I've tried to refer to it as a "web service" instead). The dependency graph you mention in your first comment is impossible when it is the (web) service that is the originator of the runtime (e.g. receiving the user's web request). Also, the flow of control you post is exactly what I continued on in chat, see link above your comments. — Flater, Oct 13 '19 at 21:31
@Flater I understand to what the `Service` is referring. For example a "service" is a controller method. Said another way, the ASP.NET *framework* is the code that surrounds (frames) the BLL. I'm afraid I don't see how this relates to my point. I'm aware of the flow of control you outlined (I read the thread) and it is correct. The point I am making is that BLL should not depend on the DAL. It makes testing a nightmare, and it's avoidable using IoC. — user3347715, Oct 13 '19 at 21:40
@king-side-slide: I never stated that contracts shouldn't be in there. The main focus of the answer is that in OP's example, the Service depends on both the BLL and the DAL and is the middle man for any interaction between DAL and BLL; so I focused on the fact that the DAL needs to be behind the BLL (that is, the service does not interact with the DAL at all - not via contracts, not directly - but only with the BLL which in turn talks to the DAL). I agree that layer interactions should happen via interfaces/DI and not direct dependency, but that simply wasn't the point I was trying to correct. — Flater, Oct 13 '19 at 21:43
@Flater By simply moving the `Repository` interfaces into the BLL OP can achieve the correct dependency graph. Your examples, both above and in the thread with OP, show the BLL directly interacting with the DAL. *This* is what I'm attempting to put finer point. OP *actually has* a sound architecture if they move the `Repository` interfaces (though this is still largely academic). Asking them to *inject* a dependency into the domain model is unnecessary. — user3347715, Oct 13 '19 at 21:50
@king-side-slide: You're inferring things that are not confirmed. You have no way of knowing what the type of `unitOfWork.OrderRepository` is (it's an `IOrderRepository` though that was never explicitly elaborated on _since it's not the focus of the answer_). Putting the repository implementation in the BLL does not work because that means the DAL must therefore be exposing the db context class (to be injected in a BLL repository) which inherently means exposing EF's base `DbContext` class which in turn means you'd be leaking EF into the BLL. — Flater, Oct 13 '19 at 21:57
@king-side-slide: I'm not saying your way of doing it is wrong but it is vastly more different from my answer than you think, and you're trying to correct a wrong interpretation of what my answer represents. I suggest you write an answer of your own to showcase your suggested approach to OP as a comment cannot do it justice. — Flater, Oct 13 '19 at 21:59
@Flater `Repository` interfaces are in the BLL not implementations. For what reason should a method in the `Order` entity need to issue a query to *any* `Repository`?. Again, there are three concrete reasons to avoid this: performance cannot easily be optimized if hydration of the domain does not occur at the top level, we are adding a dependency to our domain model, and it cannot be easily tested (now we need some sort of mock `IRepository` in order to test a single `int`). I have posted an answer outlining my position. — user3347715, Oct 13 '19 at 22:15
@king-side-slide: This is not a discussion forum. The questions you're asking have already been answered in my answer/comments/chat to the degree that is relevant to OP's question. I disagree with your corrections and am not going to change my answer to fit your idea, which may be sound but is not in line with the solution I've suggested. — Flater, Oct 13 '19 at 22:22
@Flater Fair enough. The impetus for my comments was for the benefit of OP. — user3347715, Oct 13 '19 at 22:43

user3347715 · Answer 2 · 2019-10-14T16:40:16.083

Your architecture is sound.

The root of the problem here is that... well... you haven't actually defined a domain model. In many ways your question can be rephrased into something like, "How can I make sure developers write the correct code?". Well... you can't.

What I am seeing here is the tension created by using a data model (EF) as a domain model. You have encountered a mismatch between these two models. Clearly you cannot have a domain entity that contains a property (CanBeExtended) that only sometimes returns the correct business answer when queried. Think about this. You have a business rule that says, "An order can only be extended (returns true) if it has less than 10 order lines" and model that does not always enforce that invariant.

Either you must include the entire OrderLine collection every time an Order is hydrated, or you must include some additional aggregation data (e.g. int TotalNumberOfOrderLines) that can be used to enforce your invariant independent of the number currently in the collection (this option is also more performant). It really is as simple as that.

Depending on whether or not you can leverage EF to carry this our for you, you may need to put a stop to ad-hoc queries and encapsulate Order retrieval into an explicit domain Repository in order to make sure hydration is done sufficiently to enforce your rules. At this point you will have realized a first step in the separation between the data model and domain model.

Importantly, and this is in reference to your conversation with @Flater, the domain model does not need to directly reference any Repository instance. Not only does this make testing far more difficult than necessary (and create a dependency), it is always possible to know which data is necessary to carry out a use-case at the Service level:

// overview of service flow

Get data (model) necessary to carry out use-case
Coordination data (model) to fulfill objective
Persist data (model)

This allows for better performance because you will know what data is necessary for step 2 at step 1, and can therefore issue top-level queries (which can then be optimized) in preparation for step 2.

Nachiappan Kumarappan · Answer 3 · 2019-10-13T08:36:56.413

This is simply a bug and we need to see how to fix it and avoid it in the future. But let's see if there is a compiler time coupling problem

I mostly agree with @flater's answer. I am starting it in a different language to provide a different view

In your design order and order lines are entities. Order is also the aggregate. So in domain driven design lingo what it means is

Your repositories will NOT be able get order line by id
When ever you want to modify an order line, you have to get the associated order from the repository and modify and order line associated with it, and then save the order.
Your repositories should have methods to get order and save orders
Your repository should get the full order when the order is got.

Now you rightly have a repository that produces an order. However if you miss to code the Include(o => o.OrderLines) (or don't use lazy loading for OrderLines) in the repository, then you have repository that creates an aggregate that is incorrect. (You have created an Order without any OrderLines when it had OrderLines)

Now I would interpret your question this way: "How to code the repositories without bugs and ensure bug is not introduced by mistake?". My suggestion would be to unit test the repository. Depending on the database(SqlSever, PGSql) you are using, you can use nuget packages for creating mock database servers. Example https://www.nuget.org/packages/Postgres2Go/

Another point to consider is Your domain should not depend on repository at compile time, but you domain will depend on repositories at run time. So if you have a bug in your repository, your domain object might not be produced correctly as in your case.

My question was mostly about how to properly establish what _"the full order"_ exactly is. Including OrderLines makes sense: you count them later in `CanOrderBeExtended`. But if the implementation **in the BLL** later changes so that some info about a _Customer_ is needed also, we need to adapt the "GetFullOrder" method **in the DAL** to also `Include` the Customer entity. The answer by @Flater makes it clear the BLL itself should be able to askfor all necessary data, instead of relying on another party ("Service layer") to ask that for him. Service -> BLL -> (abstracted) DAL — jan, Oct 15 '19 at 09:31
In Domain Driven Design, in the tactical design, there is the idea of the aggregate. The aggregate is a transactional boundary. Now you should be able to get an aggregate from the repository or save an aggregate in the repository. One aggregate refers to other aggregates by id. Aggregate is nothing but one or more entities (Order and OrderLines - in your case). The aggregates is level where we implement the invariants. (continued in next comment) — Nachiappan Kumarappan, Oct 16 '19 at 05:10
Now when order is you aggregate, and customer is not part of it, you can not have checks in the order object about customer. If you want to have any checks, they must be implemented in domain services. When OrderLines is part Order aggregate , then it should be fully constructed when got from the repository — Nachiappan Kumarappan, Oct 16 '19 at 05:13

DataAccess Layer coupling with Domain Layer

3 Answers3

First of all

Answering the actual question

Your proposed solutions