Ways to have a history of changes of database entries

Question

What are ways to allow the versioning of database entries (data)?

Think of a content management system's ability to revert changes to articles.

What are their pros/cons?

What exactly do you want to version? The schema or the data? – tdammers, Jul 10 '12 at 05:37
I want to version the data. To stay with the CMS example, let's say the versions of _articles_. – matcauthon, Jul 10 '12 at 05:39

jmoreno · Accepted Answer · 2019-02-13T11:15:08.310

30

There are basically two approaches: keep an audit table that stores all previous values, or add start/end dates to the table itself, so that every update creates a new record and closes out the old one.

Update: SQL Server 2016 and later support this natively as a table type (system-versioned temporal tables): https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-2017

edited Feb 13 '19 at 11:15

answered Jul 10 '12 at 06:07



So the first approach might be more scalable. Since the "archived" data is rarely accessed, the database design can be optimized for that, and the _working_ table stays small. Depending on complexity, it should also be possible to save only diffs. Is it advisable to use the _memento pattern_? – matcauthon Jul 10 '12 at 09:22

That will depend on your usage. It may be enough to use triggers to populate the table(s) and then provide a way of picking what to roll back and how far. – jmoreno Jul 10 '12 at 15:02
You have a typo in your answer (patter should be pattern) – geocodezip Feb 13 '19 at 02:04

Yusubov · Answer 2 · 2019-09-17T20:36:41.823

16

One idea is to use "Insert-Only Databases". The basic idea is that you never delete or update data on a row.

Each table that needs to be tracked has two datetime columns, `from` and `to`. Both start as NULL (beginning of time to end of time). When you need to "change" a row, you add a new row and, at the same time, set `to` in the previous row to Now and `from` in the new row to Now.
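A minimal sketch of this scheme, using Python's built-in `sqlite3` module. The table `article_version`, the column names `valid_from`/`valid_to` (standing in for `from`/`to`, which are SQL keywords) and the helper functions are all hypothetical names chosen for illustration; timestamps are assumed to be ISO-8601 strings so that they compare correctly as text.

```python
import sqlite3


def open_db():
    """Create an in-memory database with one versioned table (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE article_version (
               row_id     INTEGER PRIMARY KEY,
               article_id INTEGER NOT NULL,
               body       TEXT NOT NULL,
               valid_from TEXT,   -- NULL means "beginning of time"
               valid_to   TEXT    -- NULL means "still current"
           )"""
    )
    return conn


def save(conn, article_id, body, now):
    """Close the current row (if any) and insert the new version at `now`."""
    with conn:  # one transaction, so the two rows never disagree
        cur = conn.execute(
            "UPDATE article_version SET valid_to = ? "
            "WHERE article_id = ? AND valid_to IS NULL",
            (now, article_id),
        )
        # The very first version keeps valid_from = NULL.
        valid_from = now if cur.rowcount else None
        conn.execute(
            "INSERT INTO article_version (article_id, body, valid_from) "
            "VALUES (?, ?, ?)",
            (article_id, body, valid_from),
        )


def body_at(conn, article_id, when):
    """Return the body that was valid at timestamp `when`, or None."""
    row = conn.execute(
        "SELECT body FROM article_version WHERE article_id = ? "
        "AND (valid_from IS NULL OR valid_from <= ?) "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (article_id, when, when),
    ).fetchone()
    return row[0] if row else None
```

Doing the close-out and the insert in a single transaction is one way to keep the `to` of the old row and the `from` of the new row from drifting apart.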

For more detailed info look at:

This technique, called AuditTrail, is used to manage legacy data, and it effectively stores the change history.

It looks like a question of this nature has already been posted:

edited Sep 17 '19 at 20:36

answered Jul 10 '12 at 09:51


Sadly that question looks to have been deleted :( – Douglas Gaskell Feb 16 '16 at 02:50
No problem, here is the [link](http://stackoverflow.com/q/1051449/1437962) . Another good design suggestion in [link](http://stackoverflow.com/q/201527/1437962) – Yusubov Feb 24 '16 at 15:56
My gut feeling regarding keeping the `from` and `to` fields in sync is expressed well in [this answer](https://dba.stackexchange.com/a/114738/215082): _Avoid what I call Row Spanning Dependency. That is where one field (End_Date) of a row must remain in synch with another field (Start_Date) of a different row. This makes working with the data more difficult and is an excellent source of anomalies._ – bluenote10 Jun 16 '21 at 06:52

score 4 · Answer 3 · answered Jul 10 '12 at 14:45

The method we use for versioning database entries is to use an auditing table. The table has a schema along the lines of:

Seq      - Int      ' Unique identifier for this table
Event    - Char     ' Insert / Update / Delete
TblName  - Char     ' Table that had field value changed
FldName  - Char     ' Field that was changed
KeyValue - Char     ' delimited list of values for fields that make up the PK of table changed
UsrId    - Char     ' User who made the change
OldValue - Char     ' Old value (converted to character)
NewValue - Char     ' New value (converted to character)
AddTs    - DateTime ' When the change was made

We then have triggers on Insert / Update / Delete of the tables that we want to track.
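As a rough illustration, the table and triggers could look like the following in SQLite, driven from Python's `sqlite3` module. The tracked table `article`, its `modified_by` column (used here as the source of `UsrId`, since SQLite has no session user), the trigger names and the helper functions are assumptions made for this sketch; only the audit column names come from the schema above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, modified_by TEXT);

CREATE TABLE audit (
    Seq      INTEGER PRIMARY KEY AUTOINCREMENT,
    Event    TEXT, TblName TEXT, FldName TEXT, KeyValue TEXT,
    UsrId    TEXT, OldValue TEXT, NewValue TEXT,
    AddTs    TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER article_title_ins AFTER INSERT ON article BEGIN
    INSERT INTO audit (Event, TblName, FldName, KeyValue, UsrId, OldValue, NewValue)
    VALUES ('Insert', 'article', 'title', CAST(NEW.id AS TEXT),
            NEW.modified_by, NULL, NEW.title);
END;

-- Only log an update when the tracked field really changed.
CREATE TRIGGER article_title_upd AFTER UPDATE OF title ON article
WHEN OLD.title IS NOT NEW.title BEGIN
    INSERT INTO audit (Event, TblName, FldName, KeyValue, UsrId, OldValue, NewValue)
    VALUES ('Update', 'article', 'title', CAST(OLD.id AS TEXT),
            NEW.modified_by, OLD.title, NEW.title);
END;

CREATE TRIGGER article_title_del AFTER DELETE ON article BEGIN
    INSERT INTO audit (Event, TblName, FldName, KeyValue, UsrId, OldValue, NewValue)
    VALUES ('Delete', 'article', 'title', CAST(OLD.id AS TEXT),
            OLD.modified_by, OLD.title, NULL);
END;
"""


def open_audited_db():
    """Create an in-memory database with the tracked table and its triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def field_history(conn, table, key, field):
    """Return (Event, OldValue, NewValue) tuples for one field, oldest first."""
    return conn.execute(
        "SELECT Event, OldValue, NewValue FROM audit "
        "WHERE TblName = ? AND KeyValue = ? AND FldName = ? ORDER BY Seq",
        (table, key, field),
    ).fetchall()
```

Each tracked field needs its own set of triggers in this layout, which is where the "lots of triggers" cost comes from.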

Pros:

All the data is in one table
Can be set up to track all fields or specific fields in a table
Easy to show versioning on each field for a table

Cons:

Having all auditing information in one table results in an extremely large number of records
Lots of triggers needed

score 2 · Answer 4 · answered Jul 10 '12 at 11:04

I think you can use triggers for each table and maintain the data in a _history table (or any name you like). Every insert, update, or delete on the main table fires the trigger, and you can save the details in this table. The trigger mechanism is also available in SQLite, if you are using it.

This mechanism is useful for large projects as well. In this table you can log which user made each change, along with the timestamp of the change. You can then restore your table to any timestamp that matches your requirements.
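One possible shape for this, sketched with Python's `sqlite3` module. The `article` and `article_history` tables, the `modified_by`/`modified_at` columns (assumed to be filled in by the application, with ISO-8601 timestamps) and the `restore_as_of` helper are hypothetical. The restore only looks at insert and update snapshots and assumes the row still exists.

```python
import sqlite3

SCHEMA = """
CREATE TABLE article (
    id INTEGER PRIMARY KEY, body TEXT, modified_by TEXT, modified_at TEXT);

CREATE TABLE article_history (
    hist_id INTEGER PRIMARY KEY AUTOINCREMENT,
    action TEXT, id INTEGER, body TEXT, modified_by TEXT, modified_at TEXT);

CREATE TRIGGER article_hist_ins AFTER INSERT ON article BEGIN
    INSERT INTO article_history (action, id, body, modified_by, modified_at)
    VALUES ('insert', NEW.id, NEW.body, NEW.modified_by, NEW.modified_at);
END;

CREATE TRIGGER article_hist_upd AFTER UPDATE ON article BEGIN
    INSERT INTO article_history (action, id, body, modified_by, modified_at)
    VALUES ('update', NEW.id, NEW.body, NEW.modified_by, NEW.modified_at);
END;

CREATE TRIGGER article_hist_del AFTER DELETE ON article BEGIN
    INSERT INTO article_history (action, id, body, modified_by, modified_at)
    VALUES ('delete', OLD.id, OLD.body, OLD.modified_by, OLD.modified_at);
END;
"""


def open_db():
    """Create an in-memory database with the main table, history table and triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def restore_as_of(conn, article_id, when, user, now):
    """Set the row back to its latest snapshot at or before `when`.

    Returns False if no snapshot that old exists. The restore is itself an
    update, so the trigger records it as a new history row.
    """
    snap = conn.execute(
        "SELECT body FROM article_history "
        "WHERE id = ? AND action IN ('insert', 'update') AND modified_at <= ? "
        "ORDER BY modified_at DESC, hist_id DESC LIMIT 1",
        (article_id, when),
    ).fetchone()
    if snap is None:
        return False
    with conn:
        conn.execute(
            "UPDATE article SET body = ?, modified_by = ?, modified_at = ? "
            "WHERE id = ?",
            (snap[0], user, now, article_id),
        )
    return True
```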

Every database has its own syntax for writing triggers. If you are using SQLite, visit SQLite.org for the syntax. For other databases, see their official sites.

score 1 · Answer 5 · answered Jul 10 '12 at 08:49

You're probably aware of the SQLite db engine. The whole db is saved in a single file. The API also supports virtual file systems, so you can organize the storage anywhere and in any format, as long as you respond to read and write operations at particular file offsets. Possible applications include encryption, compression and so on. The best part is that the container layer does not need to know anything about databases, SQL or the SQLite file format; it just has to serve the xRead and xWrite callbacks.

One of the ideas was to implement a time-machine feature: every xWrite operation saves each segment it is about to overwrite in an "undo" history, and the user can choose a date in the past to see what the db contained then (probably in read-only mode). I don't have a working example yet (there was a discussion about it on the SQLite mailing list), but other engines probably supply VFS APIs, so something similar should be possible. Once implemented, it should be compatible with database structures of any complexity.

Do you think this approach is scalable for larger projects? – matcauthon, Jul 10 '12 at 09:24
I think this could add a large data overhead for big data changes (obviously, since every single change has to be saved, although compressing older versions can help). Apart from that, from the point of view of your schema, as long as it works for two tables, it works for twenty. – Maksee, Jul 10 '12 at 10:11

Brad · Answer 6 · 2012-07-10T14:35:58.233

0

I'm doing a version of this now. For every record I have an Inserted date, a Modified date and an Active Record boolean flag. On the initial insert, the Inserted and Modified dates are both set to Now() (this example is in Access) and the Active Record flag is set to true. When I modify that record, I copy the whole thing to a new record, changing the field(s) the user is changing; I leave the Inserted date equal to the original and set the Modified date to Now(). I then flip the Active Record flag of the original record to false and that of the new record to true. I also have a ModifiedRecordsParentID field where I save the identity of the original record.

Then, if I ever need to query, I can just return records where ActiveRecord = true and I will get only the most up-to-date information.
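The answer describes an Access setup. The following is only a rough equivalent in Python with `sqlite3`, to make the copy-and-flip step concrete. The table `record`, its column names and the helper functions are invented for this sketch, and `parent_id` is read here as "the record this copy was made from", which is one interpretation of ModifiedRecordsParentID.

```python
import sqlite3


def open_db():
    """Create an in-memory table with the date columns and the active flag."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE record (
               id          INTEGER PRIMARY KEY,
               parent_id   INTEGER,           -- record this copy was made from
               value       TEXT,
               inserted_at TEXT,
               modified_at TEXT,
               active      INTEGER NOT NULL   -- 1 = current version
           )"""
    )
    return conn


def insert_record(conn, value, now):
    """Initial insert: both dates are `now`, and the row is active."""
    with conn:
        cur = conn.execute(
            "INSERT INTO record (parent_id, value, inserted_at, modified_at, active) "
            "VALUES (NULL, ?, ?, ?, 1)",
            (value, now, now),
        )
        return cur.lastrowid


def modify_record(conn, record_id, new_value, now):
    """Copy the active row with the new value and deactivate the old one."""
    with conn:
        old = conn.execute(
            "SELECT inserted_at FROM record WHERE id = ? AND active = 1",
            (record_id,),
        ).fetchone()
        if old is None:
            raise LookupError(f"no active record with id {record_id}")
        conn.execute("UPDATE record SET active = 0 WHERE id = ?", (record_id,))
        cur = conn.execute(
            "INSERT INTO record (parent_id, value, inserted_at, modified_at, active) "
            "VALUES (?, ?, ?, ?, 1)",
            (record_id, new_value, old[0], now),  # inserted_at is carried over
        )
        return cur.lastrowid


def current_records(conn):
    """Return (id, value) for the active rows only."""
    return conn.execute(
        "SELECT id, value FROM record WHERE active = 1 ORDER BY id"
    ).fetchall()
```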

edited Jul 10 '12 at 14:35

answered Jul 10 '12 at 11:57


No need for the `ActiveRecord` flag. The MAX(*) row should always be the current record. Restoring to a previous version simply inserts said row into the table again. – invert Jul 10 '12 at 14:19
I wasn't sure how to make the select work, but now that you're calling this out I'm thinking about it and have an idea, hmmmm – Brad Jul 10 '12 at 14:34
Usually MAX(column_name) selects the largest value in the table's column. To select the whole row, a simple `select top 1 order by id descending` will do. – invert Jul 11 '12 at 07:46
Yeah, that works for a simple single record but my table was a collection of child records which would need to be selected at once but could have been modified individually. Just a little more complex. – Brad Jul 11 '12 at 12:05

score 0 · Answer 7 · edited May 23 '17 at 12:40


Also, if you want to store ALL changes to the DB over time, you might want to check out logging (https://stackoverflow.com/questions/3394132/where-can-i-find-the-mysql-transaction-log)

answered Jun 11 '15 at 01:02

Sabrina Gelbart

