0

Let us say I have an entity House that I want to store on a database. The House has an attribute yearBuild. If the house was build before 1990 its status should be 'ToBeSelled'. Now I have two ways to model this

a) store the status in the database as attribute(metadata) or

b) extract the status on runtime each time I am requesting the House .

So the actual question is : Should I store metadata or create them at runtime ? I really don't like storing them as their maintenance is difficult as you have to handle 'stored states' but on the other hand creating them on runtime may make them more difficult to be comprehensive it terms of code readability mainly . What is the best practice in this scenario ?

In my scenario I might need somewhere else in the code an if check like

if house.status == 'TobeSold' 

but still the status can be generating solely from the base entity .

I also don't care too much about performance .

Cap Barracudas
  • 1,001
  • 9
  • 17
  • If the rule changes tomorrow, so that all the houses after 1992 are ToBeSold, do you want the houses that you calculated yesterday that were built in 1991 to stop being ToBeSold, or do you want them to continue being ToBeSold and only the newly entered 1991 houses won't be? – user253751 Jun 17 '22 at 16:46
  • I want them to stop , that is why I am saying it is difficult to manage states . – Cap Barracudas Jun 17 '22 at 20:51
  • 1
    then calculate it at runtime – user253751 Jun 20 '22 at 08:51

5 Answers5

3

Most modern databases will allow you to have your cake and eat it too. You can, in the databases, code up business logic like this as a virtual column. This will let you have a column that is calculated based on other attributes (yearBuilt < 1990) and is calculated on the fly, but which you can treat as a column. This means you don't have to maintain it and can centralize and the logic.

Database support for virtual columns include MariaDB, SQL Server, Oracle, PostgreSQL, SQLite and Firebird. Databases that don't have this feature usually support something similar in their DB views.

Egret
  • 414
  • 2
  • 7
2

This doesn't sound like metadata but business rules (pretty inflexible in this example). Business rules should normally be evaluated when you need their results, although it sometimes makes sense to cache the results if the computation is expensive.

In your example, it's cheap, and there is no reason to store the value in the database.

Hans-Martin Mosner
  • 14,638
  • 1
  • 27
  • 35
  • Thanks . Even if the business rule is very complex , such as 20 conditions ?( I suppose so, right ). Also what makes you say these are not metadata ? – Cap Barracudas Jun 16 '22 at 12:03
  • It's a matter of how they are evaluated. Technically at runtime it is metadata, but storing it is a matter of whether there are rules about _when_ the evaluation should happen, or if you need to optimize displaying the data due to the time it takes to calculate. – Berin Loritsch Jun 16 '22 at 12:40
1

Store the metadata on the database.

You will want to use the database for filtering and reporting. ie "how many houses are in status "ToBeSold"?" You cant loop through them all loading them into your House object and keeping a count and you shouldn't put the status function into your datalayer.

There is also another reason.

Business rules like "houses older than the 90's should be marked 'ToBeSold'" seem simple but will soon evolve beyond a simple function of the House data.

How about; 'houses built before the 90's should be in state "ToBeSold" unless they have previously been in state "BoughtForLongTermInvestment"'

Or perhaps '... or have been manually overridden by a user'

To implement these rules as functions you will have to add more stored metadata, when you might as well just store the calculated result.

Ewan
  • 70,664
  • 5
  • 76
  • 161
1

In my experience (and I like a lot to calculate these things on the fly) the answer to the question is driven by questions like

  • How often do I need to query (the database) by toBeSold?
  • In how many places do I need to know yearBuild < 1990 = toBeSol ?
  • How many people (dev team or not) need to know these sorts of rules?
  • How often do I have to reason about yearBuild < 1990 = toBeSol?

Concerns about performance aside, saving the value has a "practical" aspect. From the development point of view, you don't have to code that knowledge in different locations1. It's solved once (before the insert or the update) and everything coming after doesn't need to know the business rules the value comes from or how to calculate the value.

As a business rule, it's worth coding it "once" and making it part of the source code, not the database. In other words, don't solve yearBuild < 1990 by SQL or database built-in functions every time you search or fetch houses because it can turn easy queries into a nightmare.

If you were a methodic developer (probably you are), you are likely to be consistent in your solutions. You will solve the same problem in the same way all over the code. While yearBuild < 1990 = toBeSol is a fairly simple condition, this same solution involving fields from other tables will make it impractical if you ever need to solve the value toBeSold by SQL. Or in a non-programmatic way.

Finally, don't force developers to keep in mind these sorts of rules, because they cause a cognitive burden and, for some people, can be stressful. Coding thinking in other developers (not only in you) usually results in simpler and cleaner code.


1: modules, backend, frontend, SQL queries, remote consumers you know nothing about and they know nothing about your business rules, reports, etc.

Laiv
  • 14,283
  • 1
  • 31
  • 69
0

You have to choose according to which is the minimum of maximum complexities in terms of Big-O notation. One if on the fly executed lots of times is actually linear so O(n). You should estimate how much does the complexity cost while retrieving from the DB and compare both values. You should also take into consideration your sources in the scenario of scale up. Meaning is it cheaper in terms of resources to allocate more Hard Disk memory to the DB (in storing scenario), What's the read/write/retrieval cost and how much transfer bandwidth do they need ? On the other hand, What's the upper CPU capacity to process in the if O(n) scenario and how much more costly is to use cache (RAM) to keep most used values

GGEv
  • 101
  • 2