10

I know that triggers can be used to validate stored data and keep the database consistent. However, why not validate the data on the application side before storing it in the database?

For example, we store clients, and we want to perform some validation that cannot easily be done at the DDL level. https://severalnines.com/blog/postgresql-triggers-and-stored-function-basics

Another example is auditing.

Update

How do triggers and database transactions work together? For example, suppose I want to validate data being inserted, and the insert runs inside a transaction. Which happens first: is the transaction committed, or is the trigger executed?

Yan Khonski
  • 211
  • 1
  • 8
  • `However, why not perform validation of data on the application side before storing them into the database?` well, these two are not mutually exclusive. It's likely you will validate different things on both sides. While the validations on the application side are business-centric, the validations on the database are more data-centric. Think in a database fed by several and different applications. – Laiv Feb 14 '19 at 08:37
  • Will you need to insert data from an outside source that is just doing a mindless dump of data into your system? If so, you might have cases where you need the data inserted in an invalid format for correction later. In this case, triggers will cause endless problems. Just bear this in mind. Sometimes you *don't* want a trigger executed. – Greg Burghardt Feb 14 '19 at 13:34
  • Your update should be a question on its own, maybe on SO, but your question already has an answer on [Database Administrators.SE](https://dba.stackexchange.com/questions/153525/what-is-the-difference-betwen-after-update-and-before-update-in-postgresql). In short, even an "after update" trigger is executed before the transaction is committed, and if an exception gets thrown inside the trigger, it will cause a rollback. – Doc Brown Feb 14 '19 at 14:22
  • @GregBurghardt Most databases have statements that can be used to disable triggers for those kinds of activities. – Blrfl Feb 14 '19 at 15:22
  • 1
    @Blrfl: Yes, but you need to be aware that those triggers could be disabled temporarily for all connected users, when you only want to conditionally disable those checks for the current session. – Greg Burghardt Feb 14 '19 at 16:28
  • @GregBurghardt I'm well aware of that. In databases where DDL is transactional, the disable/dump/enable cycle can be fully isolated. Even without that, you can still build logic into triggers to ignore certain transactions. – Blrfl Feb 14 '19 at 16:40

5 Answers

12

It depends on what kind of application system you are building:

  • if you are creating an application-centric system with just one main application, a dedicated database specifically for that application, and ideally one team responsible for evolving the application and database side by side, you can keep all validation logic and audit logic inside the application.

    The main benefit of this is that you do not have to distribute the business logic between application and db, so maintaining and evolving the system often becomes easier. As a bonus, you do not tie the application too closely to a specific type of DBMS or DBMS vendor. This approach is obviously required if your application must be able to use a lightweight DB system which does not provide triggers.

  • If, however, you create a system where many different applications share a common database, and it cannot be foreseen which applications will write to it in the future, or which teams will develop those applications, then it is better for the database to be responsible for guaranteeing as much data consistency as it can. That is where triggers get really helpful. In larger systems, referential constraints are often not sufficient for this, but a trigger which calls a stored procedure can implement almost any kind of validation you need.
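As a concrete sketch of the kind of data-dependent rule a plain constraint cannot express, consider a limit that itself lives in a table. The example below uses SQLite from Python purely for illustration; trigger syntax varies between DBMSs, and the table and column names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE settings (max_discount REAL NOT NULL);
INSERT INTO settings VALUES (0.30);

CREATE TABLE clients (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    discount REAL NOT NULL DEFAULT 0
);

-- The allowed maximum lives in another table, so a CHECK
-- constraint cannot express this rule; a trigger can.
CREATE TRIGGER clients_discount_check
BEFORE INSERT ON clients
WHEN NEW.discount > (SELECT max_discount FROM settings)
BEGIN
    SELECT RAISE(ABORT, 'discount exceeds allowed maximum');
END;
""")

conn.execute("INSERT INTO clients (name, discount) VALUES ('Alice', 0.10)")  # accepted
try:
    conn.execute("INSERT INTO clients (name, discount) VALUES ('Bob', 0.50)")
except sqlite3.IntegrityError as e:
    print(e)  # discount exceeds allowed maximum
```

In PostgreSQL the same idea would typically be a trigger that calls a PL/pgSQL function which raises an exception when the rule is violated; the point is that the check runs no matter which application performs the insert.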

Another reason for using triggers can be performance: in complex data models, it is not uncommon to encounter consistency rules which require a lot of additional data that is not part of the current working set at the client application. Transferring all that data over the network just to make validation possible at the client side can have a notable performance impact.

See also this older SE post: Application Logic Vs DB Triggers for database cleaning

So decide for yourself what kind of system you are building; then you can make a well-founded decision on whether triggers are the right tool for your case or not.

Doc Brown
  • 199,015
  • 33
  • 367
  • 565
3

I think the question is about responsibility for the quality of the data.

The answer depends on how you see the system.

If you see the database as an independent, distinct, and autonomous service separate from the application, then the database is responsible for ensuring the consistency and quality of the data it contains. Essentially because that database could be used by a different application, so it cannot rely on that second application having the same consistency and quality behaviours. In these circumstances the database needs to be designed to expose an API and autonomous behaviour. In this view there are at least two applications, one of them is the database and the other is the application using it.

Conversely, the database could be considered a complicated form of file that is under the direct and total control of the application. In this sense the database devolves to being a pure serialisation and document navigation tool. It may provide some advanced behaviours to support querying and document maintenance (like JSON or XML tools do), but then again it does not have to (like most file streams). In this case it is purely the program's responsibility to maintain the correct format and content within the file. In this view there is one application.

In both views the next question is how to support the usage of the database as either a fancy file, or a separate service. You could achieve this by:

  • using the tools that the database platform provides in the form of tables/views/stored procedures/triggers/etc...
  • wrapping the database itself within a service that all clients must use in order to access the database
  • wrapping the database in a library which must be used by all clients in order to access the data.

Each comes with its own pros/cons and will depend upon the architectural constraints of the environment the system operates within.

Regardless of which view you take it always pays to validate data at boundaries.

  • Validate the fields on a UI that a user enters
  • Validate the network/API request before it leaves the client
  • Validate the network/API Request in the server before doing anything
  • Validate the data being passed into business rules
  • Validate the data before being persisted
  • Validate the data after being retrieved from persistence
  • so on and so on

How much validation is warranted at each boundary depends upon how risky it is to not validate it.

  • multiplying two numbers together?
    • you get the wrong number is that a problem?
  • invoking a procedure on a given memory location?
    • What is in that memory location?
    • What happens if the object does not exist, or is in a bad state?
  • using a regex on a string containing kanji?
    • Can the regex module handle unicode?
    • Can the regex handle unicode?
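To make the boundary idea concrete, here is a minimal sketch of application-side validation at the persistence boundary. The field names and rules are made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class Client:
    name: str
    postal_code: str

def validate_client(client: Client) -> list[str]:
    """Collect boundary-validation errors before the data is persisted."""
    errors = []
    if not client.name.strip():
        errors.append("name must not be empty")
    if client.postal_code and not client.postal_code.isdigit():
        errors.append("postal code must contain only digits")
    return errors

print(validate_client(Client("Alice", "10115")))  # []
print(validate_client(Client("", "AB-12")))
# ['name must not be empty', 'postal code must contain only digits']
```

Returning a list of errors rather than raising on the first one lets the UI boundary report everything at once, while deeper boundaries (the service, the database) can still enforce their own subset of the rules.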
Kain0_0
  • 15,888
  • 16
  • 37
  • Centralizing validation logic is good, but imho triggers are not a good way to implement that. I used to work on a system where multiple applications all shared a database, with all validation logic and side effects implemented in the database through triggers and stored procedures. I came away with the impression that it's better to have a microservice in front of the database and implement all logic there. Non-trivial logic inside a SQL database is an anti-pattern. – Joeri Sebrechts Feb 14 '19 at 08:14
  • 1
    @JoeriSebrechts Okay, I'll bite: _why_ is nontrivial logic in a database an antipattern, and what makes it more of an antipattern than putting it in a separately-maintained program? – Blrfl Feb 14 '19 at 14:13
  • @Blrfl Two reasons, the API for accessing the logic in the DB is inferior to a web service API (harder to version, harder to use, not easily cached, ...), and databases make it harder to cleanly structure and maintain the codebase hosted inside them. In my experience it's easier to host the logic in a web service in front of the database than inside that database. – Joeri Sebrechts Feb 15 '19 at 08:05
  • @JoeriSebrechts I concur that most Database platforms provide woeful tools for implementing a credible, useful, and develop-able API. In many ways it is certainly an invitation to feel a lot of pain. My point was that it is about perspective, realising that the DB is a fancy file, or that it is truly a separate service leads to the next question which is how to be wrap that to support that style of usage. I'll elaborate my answer to be clear on that. – Kain0_0 Feb 17 '19 at 23:32
2

No, you should never use triggers to do validation.

The database is only responsible for its own integrity. Any user-facing validation should be performed by your application.

Databases perform three levels of validation for integrity. The first is field-level validation. A field can be required: if there is no value (null), it is an error. There can also be a check constraint, so that a field is restricted to an enumerated set of values.

Secondly, there are relations between tables. In one table you store one or more foreign keys, relating this table to other tables and requiring the values to be valid keys in the other table. Think of an address database where we support addresses in different countries: a country key in an address must point to a known country. Whether the data (e.g. a postal code) is valid is not a concern of this integrity check.

Thirdly, and most complicated, are triggers. As a general rule these should address (pun not intended) integrity rules that are conditional. To come back to the address example: if a country does not have postal codes, it would be a problem if an address in that country had one. So the check would be: if this country does not have postal codes, the postal code field must be null.
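A sketch of such a conditional rule as a trigger, using SQLite from Python for illustration (syntax differs per DBMS, and the country data is fictional):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (
    code TEXT PRIMARY KEY,
    has_postal_codes INTEGER NOT NULL   -- 0 or 1
);
INSERT INTO countries VALUES ('DE', 1), ('XX', 0);

CREATE TABLE addresses (
    id INTEGER PRIMARY KEY,
    country TEXT NOT NULL REFERENCES countries(code),
    postal_code TEXT
);

-- Conditional integrity rule: a country without postal codes
-- must not have addresses that carry a postal code.
CREATE TRIGGER addresses_postal_code_check
BEFORE INSERT ON addresses
WHEN NEW.postal_code IS NOT NULL
     AND (SELECT has_postal_codes FROM countries
          WHERE code = NEW.country) = 0
BEGIN
    SELECT RAISE(ABORT, 'country does not use postal codes');
END;
""")

conn.execute("INSERT INTO addresses (country, postal_code) VALUES ('DE', '10115')")
try:
    conn.execute("INSERT INTO addresses (country, postal_code) VALUES ('XX', '99999')")
except sqlite3.IntegrityError as e:
    print(e)  # country does not use postal codes
```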

Validation is the concern of the application. The fact that a German postal code consists of only digits is a check the application should make, not the database. The line is a thin one, so in some cases you may need some thinking or discussion about whether something belongs in a trigger (protecting the integrity of your database) or in the application (user-facing validation).

Menno Hölscher
  • 195
  • 1
  • 1
  • 4
  • Just wanted to add that if the OP needs to add a complex validation rule that needs to be in the database, he can always use stored procedures as a safer alternative. – Borjab Feb 14 '19 at 17:48
  • @Borjab : Validation as in keeping the database correct, maybe. But user facing validation? No. – Menno Hölscher Feb 14 '19 at 21:13
  • 1
    Your first statement says *"never use triggers to do validation"*, but below you write, "yes, you can use triggers for certain kinds of validation, and it is not inherently clear where to draw the line". This reads quite contradictory. I would recommend to delete your first sentence, that would improve your answer heavily. Oh, and your last sentence does not answer the OPs update question, since "before/after" has nothing to do with the transaction. I would recommend to delete it as well. (see my comment below the OPs question). – Doc Brown Feb 16 '19 at 10:17
  • @DocBrown The distinction is between protecting the database from corruption and user facing validation. So in "any further validation" I refer to user facing validation. How could I make this distinction more clear? As a start I removed the "further". – Menno Hölscher Feb 16 '19 at 13:41
  • 2
    It is perfectly fine to do validation in the database. Just as it is fine to do it in the application. Both have their advantages. Doing your validation in the application means you have to be super careful every time you run SQL without your ORM which is needed for just about every complex app. – Qwertie Feb 18 '19 at 05:50
  • @Qwertie I clearly think you should not. Separation of concerns is dear to me. Btw, if you have to run SQL __many__ times on a live system, it may be time to look into development and testing processes? – Menno Hölscher Feb 18 '19 at 10:58
  • @MennoHölscher Active record is not nearly powerful enough to replace SQL and something as simple as getting records with a date older than x requires writing sql – Qwertie Feb 18 '19 at 22:21
1

Because the question is about whether we really need triggers in relational databases, here are some other use cases for triggers:

  1. For auditing as described in the other answers.
  2. Auditing in the wider sense: if a database entry is changed, a trigger can record the event for asynchronous post-processing, e.g. nightly exports to another application.
  3. Triggers for views: triggers can be defined INSTEAD OF insert, update, and delete statements. By this means one can insert, update, and delete entries through a view, and the trigger can spread these actions onto multiple tables. This is a way to make a restricted view updatable without exposing the details of the underlying tables.
  4. To explicitly save database round trips: assume an N:M relation between tables A and B and an intermediate table R. You can define foreign key constraints from R to A and to B, specifying that an entry in R is dropped when its corresponding entry in B is deleted. However, the business logic sometimes requires that entries in A have at least one relation to an entry in B. In this case a trigger on deletion from R can help enforce this logic: if the last entry in R for an entry in A is deleted, the trigger can delete A as well. In the application-centric view, at least two round trips would be necessary. This is an example of validation; other examples are conceivable. Besides use cases (1), (2), and (3), where triggers save round trips too, consider the case where a user is inserted into a user table and a set of defaults is to be generated in a preferences table.
  5. Trust: sometimes database admins change entries on the command line, bypassing your application. Admins work carefully and know what they do. However, sometimes they are wrong. If consistency is critical, a trigger is their safety belt.
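As an illustration of use case (1), here is a minimal audit trigger, sketched with SQLite from Python (schema and syntax are illustrative; other DBMSs spell this differently):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (
    id INTEGER PRIMARY KEY,
    service_level TEXT NOT NULL
);

CREATE TABLE clients_audit (
    client_id INTEGER NOT NULL,
    old_level TEXT,
    new_level TEXT,
    changed_at TEXT NOT NULL DEFAULT (datetime('now'))
);

-- Record every change of service level, no matter which
-- application (or admin session) performed the UPDATE.
CREATE TRIGGER clients_audit_trg
AFTER UPDATE OF service_level ON clients
BEGIN
    INSERT INTO clients_audit (client_id, old_level, new_level)
    VALUES (OLD.id, OLD.service_level, NEW.service_level);
END;
""")

conn.execute("INSERT INTO clients (service_level) VALUES ('basic')")
conn.execute("UPDATE clients SET service_level = 'premium' WHERE id = 1")
print(conn.execute("SELECT old_level, new_level FROM clients_audit").fetchone())
# ('basic', 'premium')
```

Because the trigger runs inside the database, the audit row is written even when the update comes from an admin on the command line, not just from the application.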

As a drawback, business logic is distributed between the layers, which is a major disadvantage for maintenance. As another author wrote, the boundary between application and database is a thin one, and the choice is not always clear. My personal opinion is that triggers place a burden on developers, though they may save development time. They definitely enhance user experience because they boost performance over slow network connections.

Claude
  • 183
  • 1
  • 7
0

Auditing is a classic example of effective trigger use. I once found errors made by a tester (moving a client from one level of service to another) thanks to an audit table implemented with triggers. I highly recommend using triggers for auditing.

Validation can be done at the front-end level, but I've seen weird errors in databases I've maintained (people born in the year 3000, etc.), and since some of them were my own doing, I highly recommend an extra layer of validation in the database, just in case. Of course, these kinds of errors can also be avoided with check constraints, and check constraints are often more effective (in MS SQL they are the preferred method; always check the documentation).
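For the birth-year example, a check constraint is enough. A sketch using SQLite from Python (the year bounds here are arbitrary, chosen just to catch obviously impossible values):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE clients (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    -- Reject obviously impossible birth years at the database level,
    -- even if the front end forgot to validate.
    birth_year INTEGER NOT NULL
        CHECK (birth_year BETWEEN 1900 AND 2100)
)
""")

conn.execute("INSERT INTO clients (name, birth_year) VALUES ('Alice', 1985)")
try:
    conn.execute("INSERT INTO clients (name, birth_year) VALUES ('Bob', 3000)")
except sqlite3.IntegrityError as e:
    print(e)  # constraint violated
```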

Hila DG
  • 109
  • 2