Should We Use Surrogate Primary Keys for Every Table?

Question

We are developing a data model for a marketing database that will import transaction, customer, inventory, etc. files and the directive is ONE process that works for every client. We have been told every client will have different import layouts and different columns that identify a table's primary key.

Our initial idea is to get definitions from the client on what columns make each record unique, store those in a mapping table, and then have lookup tables that translate those primary keys to an internal surrogate keys automatically for every destination table so that every table conforms to an integer primary key no matter how many columns/types are used to make up the real pk.

The first major problem I saw was when/if have to map back to the lookup tables to get data that we did not store in the main data model, but I have ben assured that anything we ever want to query will be duplicated in the lookup and main tables so that should not be a concern.

This kind of flexibility seems like it will cause some serious limitations on:

Technology stack (no way to dynamically map these import files in SSIS, need lot of dynamic SQL or java/c#)
Scalability (based on previous concern and initial testing, this would be difficult to scale without speed concerns)
Complexity (we are already running into some complex code changes when we try to implement all these tables with historical change logging while maintaining mapping to the lookup tables for example)

My question - Is this feasible or is there another obvious solution we are missing?

Oh, no. Not this again. http://programmers.stackexchange.com/q/204521 — Robert Harvey, Mar 25 '14 at 02:12
Would this question be a better fit for the [Database Administrators](http://dba.stackexchange.com/) SE site? — Bernard, Mar 25 '14 at 02:15
@Bernard: They don't have enough flame bait over there already? — Robert Harvey, Mar 25 '14 at 02:15
http://dba.stackexchange.com/questions/6108/should-every-table-have-a-single-field-surrogate-artificial-primary-key — Robert Harvey, Mar 25 '14 at 02:19
To clarify, this is for internal pks only to simplify the flows that will take place after this data is placed in the initial destination as we will have a large variety of clients and do not want to rewrite the blackened process everytime — markA, Mar 25 '14 at 02:56
Another point of clarification: this data only comes in and is only sent out after a large flow of validation analytics and decision processing so it only has to work internally to that end — markA, Mar 25 '14 at 02:57
That nightmare is the price of violating 3FN. Hope this helps: http://dba.stackexchange.com/questions/45956/how-to-migrate-data-between-all-surrogate-databases — Tulains Córdova, Mar 25 '14 at 15:03

score 3 · Answer 1 · answered Mar 25 '14 at 02:25

We have been told every client will have different import layouts and different columns that identify a table's primary key.

Dear lord in heaven, DO NOT let this go into production without establishing a clear standard process for giving each record of importance a single, system-wide, actual key. A key that is of the same format for all customers, that you can return their imports with the system key appended to if necessary.

User-specified columns for uniqueness is just asking for data problems, unless your clients are tech-savy--and if they're tech-savy, they can build their own db. Mammoth "map" tables are just a headache waiting to happen; what happens when your client eventually has two "John M Smith" customers, who both have the same phone number and address because a grandfather moved in with the grandson who was named after him?

As for "what to do with fields not added to our main data-set", this is what databases have XML support for. (Or JSON, or an EAV table.) XML is better than JSON in this regard, as most database types with an explicit "XML" data type also have ways to index and slice XML content. (assuming you don't have a database that speaks JSON, that is.)

Should We Use Surrogate Primary Keys for Every Table?

1 Answers1