First, I want to say this seems to be a neglected question/area, so if this question needs improvement, help me make this a great question that can benefit others!
In my experience, there are two sides of an application - the "task" side (where the users interact with the application and it's objects where the domain model lives) and the reporting side, where users get data based on what happens on the task side.
On the task side, it's clear that an application with a rich domain model should have business logic in the domain model and the database should be used mostly for persistence. Separation of concerns, every book is written about it, we know what to do, awesome.
What about the reporting side? Are data warehouses acceptable, or are they bad design because they incorporate business logic in the database and the very data itself? In order to aggregate the data from the database into data warehouse data, you must have applied business logic and rules to the data, and that logic and rules didn't come from your domain model, it came from your data aggregating processes. Is that wrong?
I work on large financial and project management applications where the business logic is extensive. When reporting on this data, I will often have a LOT of aggregations to do to pull the information required for the report/dashboard, and the aggregations have a lot of business logic in them. For performance sake, I have been doing it with highly aggregated tables and stored procedures.
As an example, let's say a report/dashboard is needed to show a list of active projects (imagine 10,000 projects). Each project will need a set of metrics shown with it, for example:
- total budget
- effort to date
- burn rate
- budget exhaustion date at current burn rate
- etc.
Each of these involves a lot of business logic. And I'm not just talking about multiplying numbers or some simple logic. I'm talking about in order to get the budget, you have to apply a rate sheet with 500 different rates, one for each employee's time (on some projects, other's have a multiplier), applying expenses and any appropriate markup, etc. The logic is extensive. It took a lot of aggregating and query tuning to get this data in a reasonable amount of time for the client.
Should this be run through the domain first? What about performance? Even with straight SQL queries, I'm barely getting this data fast enough for the client to display in a reasonable amount of time. I can't imagine trying to get this data to the client fast enough if I am rehydrating all these domain objects, and mixing and matching and aggregating their data in the application layer, or trying to aggregate the data in the application.
It seems in these cases that SQL is good at crunching data, and why not use it? But then you have business logic outside your domain model. Any change to the business logic will have to be changed in your domain model and your reporting aggregation schemes.
I'm really at a loss for how to design the reporting/dashboard part of any application with respect to domain driven design and good practices.
I added the MVC tag because MVC is the design flavor du jour and I am using it in my current design, but can't figure out how the reporting data fits into this type of application.
I'm looking for any help in this area - books, design patterns, key words to google, articles, anything. I can't find any information on this topic.