230

My t-sql teacher told us that naming our PK column "Id" is considered bad practice without any further explanations.

Why is naming a table PK column "Id" is considered bad practice?

yoozer8
  • 693
  • 1
  • 8
  • 20
  • 1
    Because it does not describe the column. **ID** is a vague description. – Ramhound Oct 17 '11 at 16:45
  • 28
    Well, it is not a description actually and Id means "identity" which is very self explanatory. That is my opinion. – Jean-Philippe Leclerc Oct 17 '11 at 16:48
  • 2
    For the record, even if it is a bad practice, it is a common practice. Professors live mostly in a world of theory. – riwalk Oct 17 '11 at 16:58
  • 50
    I'm sure there are plenty of shops that use Id as a PK name. I personally use TableId as my naming convention but I wouldn't tell anyone its THE ONE TRUE WAY. It sounds like your teacher is just trying to present her opinion as widely accepted best practice. –  Oct 17 '11 at 16:58
  • 23
    Definitively the type of bad practice that isn't that bad. The point is to get consistent. If you use id, use it everywhere or don't use it. – deadalnix Oct 17 '11 at 17:00
  • 8
    Perhaps you should first ask "Is the *really* a bad practice"? – FrustratedWithFormsDesigner Oct 17 '11 at 17:11
  • 5
    It's akin to naming a string variable like this: `string stringVariable = '';` – StuperUser Oct 17 '11 at 17:14
  • 63
    You have a table...it's called People, it's got a column, called Id, what do you think the Id is? A car? A Boat? ...no it's the People Id, that's it. I don't think it's not a bad practice and it's not necessary to name the PK column anything other than Id. – jim Oct 17 '11 at 18:17
  • 3
    What was the teachers preferred naming convention? – dotjoe Oct 17 '11 at 20:10
  • 5
    Naming your PK Table_PK or Table_ID is extremely helpful when you're doing joins with FKs and using join tables, as X table's PK is ALWAYS table_ID, none of this table.id=jointable.table_id stuff – Ben Brocka Oct 17 '11 at 20:33
  • 6
    Well if you do like table.ID you're in good company. see http://data.stackexchange.com – Conrad Frix Oct 17 '11 at 22:23
  • 2
    Anyone can call something bad practice. Something has to have a reason to be a bad practice. Next time, (politely) challenge the teacher to explain why it's bad practice. – Kyralessa Oct 17 '11 at 23:36
  • 3
    I'm amazed this question has been up voted so much. This is a question of personal style that has turned into a religious war. In the real world, who cares? Follow the style your team uses. – Brook Oct 19 '11 at 13:27
  • 43
    So the teacher gets in front of the class and tells you this is a bad practice without a single reason? That's a worse practice for a teacher. – JeffO Oct 19 '11 at 14:30
  • 15
    Because it fights with the column named 'Ego' ? – Martin Beckett Apr 17 '12 at 15:42
  • 1
    Did it occur to you to ask your teacher why they hold that opinion? – Bryan Oakley Apr 20 '12 at 10:56
  • Sometimes the choice a teacher has due to time constraints and the scope of a course are: Informing about, but not explaining X vs Not informing about X. Which option is chosen depends on the course and the scope of a course. If for example a teacher is given the order to teach Python, but not object oriented programming, some things are just left at "use the dot to access functions that belong to the variable", rather than explaining that some things are called methods, some functions and that some things can be instances etc. – fnurl Dec 06 '12 at 12:59
  • Bad practice? Depends on context. Personally I find it better to follow a convention of NounId or TablenameId, but I do a lot of code generation, binding by convention, meta-data based logic and so forth wherein being able to infer the table from the Id's name can be very convenient. Also, when joining / creating views it is convenient to not have to ALIAS "Id" all over the place to avoid naming collisions in the flattened structure. But that aside I would say that this is much more of a PREFERENCE than a PRACTICE, good bad or otherwise. – Ed Hastings Dec 27 '12 at 22:11
  • 1
    Id is common practice if you are a .net developer using Entity Framework. For years now EF 4.x and now 5.0 use Id here is one example http://blogs.msdn.com/b/efdesign/archive/2010/06/01/conventions-for-code-first.aspx – Tom Stickel Mar 11 '13 at 06:13

19 Answers19

262

I'm going to come out and say it: It's not really a bad practice (and even if it is, its not that bad).

You could make the argument (as Chad pointed out) that it can mask errors like in the following query:

SELECT * 
    FROM cars car
    JOIN manufacturer mfg
        ON mfg.Id = car.ManufacturerId
    JOIN models mod
        ON mod.Id = car.ModelId
    JOIN colors col
        ON mfg.Id = car.ColorId

but this can easily be mitigated by not using tiny aliases for your table names:

SELECT * 
    FROM cars
    JOIN manufacturer
        ON manufacturer.Id = cars.ManufacturerId
    JOIN models
        ON models.Id = cars.ModelId
    JOIN colors
        ON manufacturer.Id = cars.ColorId

The practice of ALWAYS using 3 letter abbreviations seems much worse to me than using the column name id. (Case in point: who would actually abbreviate the table name cars with the abbreviation car? What end does that serve?)

The point is: be consistent. If your company uses Id and you commonly make the error above, then get in the habit of using full table names. If your company bans the Id column, take it in stride and use whatever naming convention they prefer.

Focus on learning things that are ACTUALLY bad practices (such as multiple nested correlated sub queries) rather than mulling over issues like this. The issue of naming your columns "ID" is closer to being a matter of taste than it is to being a bad practice.


A NOTE TO EDITORS : The error in this query is intentional and is being used to make a point. Please read the full answer before editing.

riwalk
  • 7,660
  • 3
  • 29
  • 32
  • 1
    @Chad, it was a reference to the fact that in that particular query you used 3 letters for aliases to table names, even when it made no sense (`cars` -> `car`. Thank God--you saved my fingers). Don't read too deeply into it. – riwalk Oct 17 '11 at 17:32
  • 4
    worse than my alias names, was my mixing of plurality `cars` and `manufacturer`. One is plural, the other isn't. If people want to pick at the db, that's the bad practice that should be picked upon. – CaffGeek Oct 17 '11 at 17:34
  • 1
    @Chad, completely true. The thing that drives me crazy about that one is that I do it without thinking and then hate myself later :) – riwalk Oct 17 '11 at 17:36
  • I do it all the time too...it pisses me off because I don't notice until it's too late to change without great difficulty. – CaffGeek Oct 17 '11 at 17:38
  • 3
    I think it is bad practice. Obviously, as far as bad practice goes it's not terrible.. but it's so easy to avoid, why not do so? Now, I agree, for an intro class, is this the thing to focus on? Probably not.... – user606723 Oct 17 '11 at 19:41
  • 2
    @user606723, actually, I think for an intro class it's an important thing to make a point of. Teaching best practices should be fundamental. It's not until you're experienced that you understand the consequences, and tradeoffs and when you should deviate from best practices. – CaffGeek Oct 17 '11 at 19:53
  • 5
    @Chad, Agree to disagree. Trying to teach students "best practices" without allowing them to understand why those are the best practices is a futile effort. And since you can't cover everything, glossing over this point with "you just shouldn't do it, you'll figure out why later" is pretty sane choice from a professor's standpoint. Curious students can post on here or hopefully find this question already answered. (Or ask after class) – user606723 Oct 17 '11 at 20:01
  • And the majority of students unfortunately will go on ignoring best practices and have to be "retrained" when they are employed...IT needs an apprenticeship path like tradespeople have. – CaffGeek Oct 17 '11 at 20:07
  • 1
    Why did you pick multiple nested correlated sub queries as a canonical bad practice? I don't really see that technique as overused. I would have picked concatenating values to SQL strings (i.e. SQL Injection). – Conrad Frix Oct 17 '11 at 22:19
  • @ConradFrix, simply because it was a problem I had to deal with recently--no other reason than that. – riwalk Oct 17 '11 at 22:20
  • I always find myself lost in corner cases like this. +1 good advise. Sorry I took away your perfect 80. :P – Wolfpack'08 Oct 20 '11 at 02:39
  • A side note: "cars" should not be abbreviated as "car", but "crs". That doesnt look much better, and I totally agree that using 3 letter abbreviations are a bad thing. – awe Oct 20 '11 at 12:23
  • Naming every PK `Id` by itself isn't a bad practice, but it leaves you having to write out tedious `ON` clauses instead of using `NATURAL JOIN` or `JOIN USING`. I wonder if the teacher didn't mean that adding surrogate `Id` columns everywhere instead of actually thinking about your model -- a la Rails -- is the bad practice. – Jesse Dhillon Oct 08 '15 at 05:09
  • A big reason to use "Id" over "tablenameId" is many ORMs encourage that use (eg Dapper Contrib) & require less configuration if "Id" is used. Under those circumstances "Id" could be argued as *best practice*. – niico May 27 '16 at 12:29
  • Isn't the examples in the wrong order. You have an example using shorter aliases and the text then goes on to say that you can use short aliases followed by an example that uses the full names. At least to me that's somewhat confusing. – skyking Feb 03 '19 at 06:04
  • @skyking That isn't what the text of the answer says at all. > "[T]his can easily be mitigated by **not** using tiny aliases for your table names:" – riwalk Feb 10 '19 at 22:26
  • @riwalk OK, I first thought that the problem that was to be mitigated was repeating long names. Now I see that you refer to a problem that is pointed out by Chad which doesn't seem to be found anywhere. The order is correct right, but I'm still confused about what problem you're avoiding with the second query then. – skyking Feb 11 '19 at 05:35
  • @skyking Both queries are exactly the same and both are completely wrong. It's easier to see *why* it's wrong in the second one. There's nothing more to be said. – riwalk Feb 26 '19 at 21:39
  • Very well done sir! I fully agree! Be pragmatic and focus on the issues that happen and what affects you most, and not on matters of taste, because, like you demonstrated, both options are perfectly valid! – Paul-Sebastian Manole Feb 21 '22 at 21:44
133

Because when you have a table with a foreign key you can't name that foreign key "Id". You have table name it TableId

And then your join looks like

SELECT * FROM cars c JOIN manufacturer m ON m.Id = c.ManufacturerId

And ideally, your condition should have the same field name on each sides

SELECT * FROM cars c JOIN manufacturer m ON m.ManufacturerId = c.ManufacturerId

So while it seems redundant to name the Id as ManufacturerId, it makes it less likely that you have errors in your join conditions as mistakes become obvious.

This seems simple, but when you join several tables, it gets more likely you'll make a mistake, find the one below...

SELECT * 
    FROM cars car 
    JOIN manufacturer mfg
        ON mfg.Id = car.ManufacturerId
    JOIN models mod
        ON mod.Id = car.ModelId
    JOIN colors col
        ON mfg.Id = car.ColorId

Whereas with proper naming, the error sticks out...

SELECT * 
    FROM cars car 
    JOIN manufacturer mfg
        ON mfg.ManufacturerId = car.ManufacturerId
    JOIN models mod
        ON mod.ModelId = car.ModelId
    JOIN colors col
        ON mfg.ManufacturerId = car.ColorId

Another reason naming them Id is "bad" is that when you are querying for information from several tables you will need to rename the Id columns so you can distinguish them.

SELECT   manufacturer.Id as 'ManufacturerId'
        ,cars.Id as 'CarId'
        --etc
    FROM cars 
    JOIN manufacturer
        ON manufacturer.Id = cars.Id

With accurate names this is less of an issue

CaffGeek
  • 8,033
  • 5
  • 32
  • 38
  • 193
    Not sure this is a good enough explanation to me. There's nothing wrong with saying `SELECT * FROM cars c JOIN manufacturer m ON manufacturer.Id = c.ManufacturerId`. I have used `id` for years and never found what you described to be a real problem. – riwalk Oct 17 '11 at 16:56
  • 64
    I would say that the bad practice here is to alias tables with name like mfg or mod. manufacturers.id = cars.manufacturer_id is very readable and the error will stick out too. – deadalnix Oct 17 '11 at 16:59
  • @deadalnix, that's hardly a bad practice. Variable name and specificity is tied to it's scope. That table prefix is being used in a single statement, then it goes out of scope. It's unique and descriptive enough in it's context. For the same reason we can safely use `i, j, k, ...` as iterators in a loop. – CaffGeek Oct 17 '11 at 17:05
  • @deadalnix, and while using the table name explicitly seems like a great practice, when you start getting into self referencing tables, you require aliases. Aliases allow you a couple benefits. First, they save typing. But more importantly, they provide an interface. I can swap out the manufacturers table for vw_manufacturers (which filters out deleted records, or whatever) at any time, without changing the rest of my query. – CaffGeek Oct 17 '11 at 17:07
  • 7
    @Chad > I already got problems with dumb variable names. Many times. For the reccord, here what I would say to a dev that does this in my team « mfg doesn't mean manufacturer, it means you are to lazy to type manufacturer ». – deadalnix Oct 17 '11 at 17:12
  • @Chad Several references to a table can be achieved using a common alias. like JOIN users as authors ON messages.author_id = authors.id . This is our naming convention. I have no claim that it is the best. The point is being consistent. – deadalnix Oct 17 '11 at 17:14
  • 10
    @Stargazer712: Doing SELECT * from 2 tables gives 2 x ID columns. ID is now ambiguous: do you reference by ordinal or name? SELECT * is not good practice either. Poor arguments. Fragile code. Chad is correct: defensive coding basically – gbn Oct 17 '11 at 18:23
  • 7
    @gbn, again, all ambiguity is gone if you simply do `SELECT manufacturer.id FROM ...`. Every difficulty resulting from `id` can be overcome very easily, making it simply a matter of taste. – riwalk Oct 17 '11 at 19:25
  • 1
    I quite like the idea of `id`, with `tablename.id` where needed. It's like the implicit reference through `this` in OOP languages instead of explicitly restating which objecting you're referring through. Why keep stating which `id` when it's the obvious one for this context? Having long names just means you have to state the whole long name explicitly all the time, even when it's just clutter. –  Oct 18 '11 at 03:00
  • @Steve314, because relational databases are OOP – CaffGeek Oct 18 '11 at 15:58
  • 1
    gbn - Why is having multiple columns with the same name a problem? You might have numerous tables with a `PhoneNumber` column. So. Without the table alias just having the column name isn't enough. – Thomas Oct 18 '11 at 17:25
  • 1
    @Chad - I'm not sure if that's sarcasm or not. It doesn't really matter either way. For readability, `thingamy.id` is precisely as readable as `thingamy_id`. Sometimes, `id` may be more readable - but if you called it `thingamy_id` you don't get the choice to drop the `thingamy_` prefix. Whatever branch of theory or whatever, it's just a notation, and the relevant issue is readability - the OOP thing in my comment was just an analogy. –  Oct 19 '11 at 00:17
  • @Steve314, Another reason for not naming it just "Id" is when you join two tables together you're likely going to want the Id from both tables correct? So you're going to have to rename the two Id columns in the result anyhow so you can distinguish one from the other. With proper names, this isn't an issue. – CaffGeek Oct 19 '11 at 13:18
  • @Chad - yes, *sometimes* you need the longer name, but not always. The minor hassle of *sometimes* setting that up is IMO outweighed by *frequently* having the option to drop the clutter. –  Oct 19 '11 at 13:35
  • @Steve314, I disagree. If the name of the field is CarId. Then I can be 99% sure that it will always be named CarId. Regardless of if it is being returned from the table directly, a view, function, or stored proc. However, if it's just named Id and relies on the programmer naming it CarId then it will sometimes be CarId, sometimes just be Id, other times it might be cId, or AutoId, or VehicleId, depending on what the programmer was thinking when writing the proc or view. It breaks the nice consistent model your database is supposed to be providing. – CaffGeek Oct 19 '11 at 14:04
  • @dotjoe, I'd rather attempt to give most columns unique names, than have every table look nearly identical and spend all my time aliasing columns. And a table having multiple foreign keys to the same table is an exception, most tables don't. And it's still not broken. The field name just becomes longer. If I have a Person record, that has two foreign keys to their parents based on PersonId, the columns would be not be named PersonId, as it's not descriptive enough. They would be named MotherPersonId, and FatherPersonId. – CaffGeek Oct 19 '11 at 18:48
  • @Chad - I'm fine with you thinking that. Part of my opinion was based on an "IMO", and therefore it's subjective. What I'm less fine about is being told my preference, which has reasoning behind it, is bad practice due to alternative views that are equally subjective. That need for the same data to always have the same name with the exact same spelling, even when it's playing a different role in a different context, is clearly a subjective viewpoint too. –  Oct 19 '11 at 19:11
  • We'll have to agree to disagree. I've given a few advantages to explicit naming. The only advantage I've seen to naming it just `Id` is it saves some typing in simplest of queries. Thus, I won't be convinced that it's not a bad practice. – CaffGeek Oct 19 '11 at 19:31
  • 1
    @Chad - typing isn't the issue - clutter reduces readability and maintainability. –  Oct 19 '11 at 20:29
  • @Steve314, I don't see how naming things properly is hardly clutter? I would say that requiring a bunch of aliases is far more clutter than a few extra characters in a field name – CaffGeek Oct 19 '11 at 20:56
  • People supporting plain `id` didn't consider with code maintenance and ease of debugging. Unfortunately , there is no statistic on frequency of mistake made on foreign key `join` , and the time wasted to debug them – mootmoot Apr 27 '17 at 08:49
  • I find your 3rd "select" much easier to read and to understand than the 4th one which is supposed to be "better"... and I will use your examples to say why I prefer the 3rd over the 4th... – Olivier Pons May 22 '17 at 15:10
  • The error in this answer's example seems a bit contrived, and the solution is a false dilemma. Rather than using more descriptive PK names, you could just as easily use more descriptive aliases, or as was pointed out, just dereference with the table names themselves. And as far as the error the convention makes "stick out" in your example, I'm not going to endorse a naming convention just to cater to people who are sloppy with their cut-paste practices. – Abion47 Sep 13 '18 at 18:56
75

Ruby's ActiveRecord library and Groovy's GORM use "id" for the surrogate key by default. I like this practice. Duplicating the table name in each column name is redundant, tedious to write, and more tedious to read.

kevin cline
  • 33,608
  • 3
  • 71
  • 142
  • 33
    +1 for "and more tedious to read." - naming conventions shouldn't be thought of as a band-aid for sloppy code, they should be improving readability as a primary concern. – ocodo Oct 18 '11 at 01:30
  • 15
    ID is far more tedious to read – HLGEM Oct 18 '11 at 20:20
  • 7
    @HLGEM: One can always qualify the column name with the table name. – kevin cline Oct 20 '11 at 21:10
  • 2
    I would agree except for the more tedious to read. I actually prefer reading more descriptive column names and spending less time figuring out what that column actually is for. – Devin Dixon Mar 20 '13 at 20:41
  • @DevinDixon: if you create debug views, then just once you need to assign better names to the view columns. But if you redundantly include the table name in the column names, all clients have to repeat them all the time whether or not it is useful. – kevin cline Mar 20 '13 at 22:17
  • 2
    +1, Hate seeing tables with columns like Posts.postID, Posts.postName where simply using post.id and post.name is far prettier. – Doug May 12 '17 at 00:14
40

Common or key column names like "Name" or "Id" should be prefixed with the TableName.

It removes ambiguity, easier to search for, means far less column aliases when both "Id" values are needed.

A lesser used or audit column or non-key (say LastUpdatedDateTime) doesn't matter

gbn
  • 2,467
  • 1
  • 17
  • 14
  • 65
    If you do this, I hate you for making me do extra typing!!!! The table's name is Person, what do you think the Id is going to be? Car? no, it's Person. – jim Oct 17 '11 at 18:19
  • 18
    @jim, I don't know about you, but typing 6 extra characters takes me roughly half a second. And considering I rarely ever select from one table, and thus would end up with two columns named 'Id' and will need to include the table/alias name anyhow, there is no savings in the number of characters typed. – CaffGeek Oct 17 '11 at 19:13
  • 24
    @Chad I find it superfluous. if I'm doing a join, c.id = m.manufacturerid, is ok with me. These columns are typically "mapped" to a class somehow, and to have a class with Person.PersonId makes me want to vomit...Yes,I am fully aware I have issues. – jim Oct 17 '11 at 19:26
  • 3
    @jim, I wouldn't name it PersonId in a class, as with classes, I always have to write Person.Id. SQL is different. You can't apply the same rules across all languages. – CaffGeek Oct 17 '11 at 19:41
  • @Chad I'll buy that. but now I'm assuming that Id in the class is mapped to the PersonId in the table. To each their own. – jim Oct 17 '11 at 19:45
  • 9
    @jim: Relational databases are not object databases. The fact that classes are designed a certain way doesn't mean that the same principles apply to databases. For example, m:m relationships are trivial between two object types and impossible between two tables without an association table. – Adam Robinson Oct 18 '11 at 01:21
  • 7
    downvoted for not being DRY (Don't Repeat Yourself). Having the table in the column name is reduntant. So users.users_name is better than users.name? Use the context of the situation. The only benefit for having a more robust primary key column is so you can do "table a joins table b using (a_id)". However, most programers today use some sort of ORM framework that builds the SQL for them in 99% of cases, hence eliminating this use case altogether. – Bill Leeper Oct 18 '11 at 15:46
  • 6
    @Bill Leeper: Many shops don't use ORMs. If 99% of developers used ORMS we'd have no "stored procs" vs "inline SQL" argument. That's your experience which isn't th general case – gbn Oct 18 '11 at 16:47
  • 22
    I also disagree with this. Why stop at `name` and `id`? Why not have every column prefixed with its table name? It seems arbitrary to pick those two names to mandate a prefix. Conceptually, you must have the table in order to have the context of a column anyway. Why not just use that table name to clarify the query: Person.Name, Animal.Name, Part.Name,... – Thomas Oct 18 '11 at 17:27
  • @Thomas: then you need aliases in every query... – gbn Oct 18 '11 at 17:28
  • 5
    @gbn - You do need to use either an alias or the table name and you should anyway. In any query with multiple tables, you should explicitly declare the table or alias to which each column belongs. – Thomas Oct 18 '11 at 21:48
  • 12
    @Bill Leeper, DRY is not always appropriate in database development. In databases what is important is performance and making the database do extra work to fullfill DRY principles (such as using scalar functions or views that call views or queries that return too many columns or using a cursor to add 1000000 records to use an existing proc) is often contraindicated. Do not think that just because something is good in the Object-oriented world that it is appropriate in database design. Your downvote was inappropriate. Using ID is a known SQL antipattern. – HLGEM Oct 20 '11 at 21:38
  • 5
    I would say however, that in the database world, it is pretty rare that you just have "The Database", more often it is part of a much more complex system, a system that involves fare more lines of code, complexity, and performance bottlenecks. The DBA should be working WITH the application developers not AGAINST them. Too often people sit on these types of outdated or one sided arguments. If the database was designed in harmony with the application it wouldn't needs views etc, except in rare occasions where logic is being encapsulated in the db. – Bill Leeper Oct 21 '11 at 17:31
  • 1
    +1 for pointing out that it makes your database code so much easier to search through. In large database projects with hundreds of thousands of lines of SQL code, it is crucial to be able to see all references to an id column in tables, variables and temp tables (which may or may not be connected through a foreign key). TableNameId as a nomeclature simply scales much better in this case. – tep Sep 15 '16 at 22:38
  • 2
    This is a Good Answer- Using Table name prefix in the primary key.We need to focus more on consistency and clarification rather than finding a way to type less. For others, who wants to argue of typing less,go improve your typing speed. It's not about typing more/less. – kta Dec 12 '16 at 00:06
  • although I tend to favor not prefixing ID column, @kta is totally right about arguments of typing less and other related stuff putting coder comfort ahead of code presentation. – Andre Figueiredo Apr 27 '17 at 13:39
36

This thread is dead, but I would like to add that IMO not using Id is a bad practice. The Id column is special; it is the primary key. Any table can have any number of foreign keys, but it can have only one key that is primary. In a database where all primary keys are called Id, as soon as you look at the table you know exactly which column is the primary key.

Believe me, for months now I've spent all day every day working in lots of big databases (Salesforce) and the best thing I can say about the schemas is that every table has a primary key called Id. I can assure you I absolutely never get confused about joining a primary key to a foreign key because the PK is called Id. Another thing that people haven't mentioned is that tables can have long silly names like Table_ThatDoesGood_stuff__c; that name is bad enough because the architect had a hangover the morning he thought up that table , but now you are telling me that it's bad practice not to call the primary key Table_ThatDoesGood_stuff__cId (remembering that SQL column names aren't in general case sensitive).

To be honest, the problems with with most people who teach computer programming are that they haven't written a line of production code in years, if ever, and they have no idea what a working software engineer actually does. Wait until you start working and then make up your own mind what you think is a a good idea or not.

  • 2
    That's only the case if none of your primary keys is a composite key, which is, unfortuately, far too often the case. One should really only use surrogate keys in particular circumstances. – nicodemus13 May 17 '12 at 16:32
  • 7
    Using Id like that you end up with a situation where without thinking developers add primary key id to each and every table they make. One of the foundations of relational databases is the use meaningfull and aggregate primary keys and using id does not help. – Pieter B Oct 20 '12 at 14:35
  • This answer just seems like you're saying "I prefer Id" with opinionated arguments. Your argument is you can instantly see which key is the primary key by finding the one called Id. Well, _it's the exact same with tableId_. I can guarantee you I never get confused which key is the primary key either. I just look for the one that has the table name before the id. Not only that, but what kind of heathen tables are you working with where the first column isn't the primary key? I feel like all your arguments in this answer are purely preferential based and akin to "tableId feels wrong to me". – dallin Oct 13 '18 at 20:53
  • 1
    @dallin all the answers are opinionated, otherwise someone would just link to the official standard, and there isn't one! – James Jun 03 '19 at 23:44
  • @James I feel like the goal of StackExchange sites is to reduce opinionated answers. That's why questions get closed as being too opinionated. Granted, I think that's kind of why this question got closed - because it elicits opinionated answers, but I feel this specific answer was overly opinionated without any real supporting facts or arguments based on facts.The entire answer can be summarized by saying, "In my opinion, you should use Id, just because that's what I like". That's not a valuable answer. – dallin Jun 05 '19 at 02:28
29

From data.stackexchange.com

Id in Posts

BOOM, question answered.
Now go tell your teacher that SO practice bad database design.

Cyril Gandon
  • 1,296
  • 1
  • 11
  • 17
  • 10
    My guess at the FKs, based on the names: `PostTypeId -> PostTypes.Id`; `AcceptedAnswerId -> Answers.Id`; `OwnerUserId -> Users.Id`. Why should a practice that is that easy be considered 'bad'? – Sjoerd Apr 18 '12 at 00:11
  • 4
    How exactly does this prove anything about best practices? – GBa Apr 18 '12 at 15:52
  • 7
    Whether something is used at stack or not does not prove if it's good or bad practice. – Pieter B Oct 20 '12 at 14:39
  • 2
    What it does prove is that this practice in no way prohibits the scalability and usefulness of an application. – Cypher Feb 13 '15 at 20:49
  • 1
    Actually SO practice is not perfect. I would use this naming: PostType -> PostType.Id; AcceptedAnswer -> Answer.Id; OwnerUser -> User.Id – alpav Feb 17 '15 at 02:10
24

I don't consider it bad practice. Consistency is king, as usual.

I think it's all about context. In the context of the table on its own, "id" just means exactly what you expect, a label to help uniquely identify it against others that might otherwise be (or appear) identical.

In the context of joins, it's your responsibility to construct the joins in such a way as to make it readable to you and your team. Just as it is possible to make things look difficult with poor phrasing or naming, it is equally possible to construct a meaningful query with effective use of aliases and even comments.

In the same way a Java class called 'Foo' doesn't have its properties prefixed by 'Foo', don't feel obliged to prefix your table IDs with table names. It is usually clear in context what the ID being referred to is.

lotsoffreetime
  • 1,121
  • 1
  • 8
  • 8
  • 5
    Relational database tables *are not classes*. – Adam Robinson Oct 18 '11 at 01:23
  • 1
    They are however data structures and they're analogous to PODO classes. The same naming problems apply. – ocodo Oct 18 '11 at 01:28
  • 4
    @Slomojo: No, they're not analogous to simple objects. Object-oriented design and database design are not the same, and are not even related. While they can, in some cases, yield similar (or even the same) design, that does not indicate that they are related. For example, m:m relationships are trivial in classes, but are impossible have between two tables without a third association table. – Adam Robinson Oct 18 '11 at 01:55
  • 1
    Quite how this relates to a naming strategy, I don't know. My analogy is (clearly?) only scoped to that extent. – ocodo Oct 18 '11 at 02:13
  • @Slomojo: It relates by way of the fact that the basis for your assertion (that tables are analogous to simple classes) is false. – Adam Robinson Oct 18 '11 at 12:34
  • 2
    I'm sorry my implied meaning wasn't very clear, I should have said "**in this sense** they are analogous to classes". Either way, I don't think being overly pedantic about this is particularly constructive. In terms of naming, tables and classes do share a significant amount of similarities. Best practices which develop in a cul-de-sac are fair game for revision, or at the very least are open to discussion. There's plenty within this Q&A that illustrate this effectively, I don't have anything else of note to add. – ocodo Oct 18 '11 at 22:36
  • The problem is in classes, there's _always_ context as to which object the property belongs to. In databases, this is not always the case. There are times when you can sometimes only use a column name. In these cases, you have to look at the entire sql statement to discern the context and whether it is ambiguous, which leads to more potential for mistakes in writing sql statements. Not only that, but many DBA's forget that programmers work with tables too, and in many instances, stored column names are not attached to table names in code. – dallin Oct 13 '18 at 21:01
17

It makes it hard (and confusing) to perform a natural join on the table, therefore yeah, it's bad if not very bad.

Natural Join is an ancient artifact of SQL Lore (i.e. relational algebra) you may have seen one of these: ⋈ in a database book perhaps. What I mean is Natrual Join is not a new fangled SQL idea, even though it seemed to take forever for DBMS's to have implemented it, therefore it's not a new fangled idea for you to implement it, it might even be unreasonable for you to ignore its existence nowadays.

Well, if you name all your primary key's ID, then you lose the ease and simplicity of the natural join. select * from dudes natural join cars will need to be written select * from dudes inner join cars where cars.dudeid = dudes.id or select * from dudes inner join cars where dudes.carid = cars.id. If you are able to do a natural join, you get to ignore what the relation actually is, which, I believe, is pretty awesome.

Peter Turner
  • 6,897
  • 1
  • 33
  • 57
  • Unless you are writing a stored procedure when is the last time as an application developer your actually wrote a fully formatted SQL selector? Modern languages all include some sort of ORM feature that manages this relationship for you. The column name is far more important than being able to write clean manual SQL. – Bill Leeper Oct 18 '11 at 15:48
  • 4
    @Bill I do all the time, many, many times a day, depends more on your codebase than the language you're developing. And, for diagnostics, if you want to do some good and deep relations you can string together those natural joins and completely avoid looking up field ID's. And, as St. Jerome famously said, "Ignorance of SQL is ignorance of databases". – Peter Turner Oct 18 '11 at 15:57
  • Aside from the fact that natural joins are not universally supported, IMO, natural joins are harder to read. Are there two columns in relationship or only one? Far better to be explicit and avoid natural joins. – Thomas Oct 18 '11 at 17:33
  • 1
    @Thomas, I wouldn't put natural joins in code either, but for diagnostics, I've found them pretty useful when the database is modeled so that they actually work. – Peter Turner Oct 18 '11 at 17:40
17

There is a situation where sticking "ID" on every table isn't the best idea: the USING keyword, if it's supported. We use it often in MySQL.

For example, if you have fooTable with column fooTableId and barTable with foreign key fooTableId, then your queries can be constructed as such:

SELECT fooTableId, fooField1, barField2 FROM fooTable INNER JOIN barTable USING (fooTableId)

It not only saves typing, but is much more readable compared to the alternative:

SELECT fooTable.Id, fooField1, barField2 FROM fooTable INNER JOIN barTable ON (fooTable.Id = barTable.foTableId)
Izkata
  • 6,048
  • 6
  • 28
  • 43
  • This is the answer that sticks out most for me. The `USING` keyword is supported by postgres/mysql/sqlite database, means less typing which some of the other answers list as a reason for using `id`, and finally in my subjective opinion is more readable. – Michael Barton Oct 11 '16 at 17:25
12

Why not just ask your teacher?

Think about this, when all your tables PK columns are named ID it makes using them as foreign keys a nightmare.

Column names need to be semantically significant. ID is to generic.

  • 8
    too generic for what? the id of a table? – jim Oct 17 '11 at 18:24
  • 3
    @Jim of which table? `id` alone doesn't mean anything, especially in the context of a foreign key to another table. This has nothing to do with `classes` and everything to do with good basic fundamental relational database design. –  Oct 17 '11 at 21:32
  • 11
    To be slightly fatuous, the table which it belongs to. `table.id` is a perfectly acceptable way of referring to an `id` field. Prefixing the field name with the table name is redundant. – ocodo Oct 18 '11 at 01:21
  • 3
    @Slomoj it is no more typing than including the name in the column and more explict when aliasing table names to single or double letter abbreviations in monster joins. –  Oct 18 '11 at 03:06
  • 6
    Of what nightmare are you referring? Suppose you have a self-referencing structure of employees with a column representing their manager. What do you call the foreign key? You can't call it EmployeeId as that is presumably your PK. The name of the FK does not have to match the PK. It should be named for what it represents to the entity in which it is contained. – Thomas Oct 18 '11 at 17:30
  • 1
    If the PK is a surrogate key, then it makes sense for it to be the generic 'ID' and the semantics are improved by doing so, as the column has nothing in common with the domain. – Michael Easter Oct 19 '11 at 19:21
7

ID is bad for the following reasons:

If you do a lot of reporting queries you always have to alias the columns if you want to see both. So it becomes a waste of time when you could name it properly to begin with. These complex queries are hard enough (I write queries that can be hundreds of lines long) without the added burden of doing unnecessary work.

It is subject to causing code errors. If you use a database that allows the use of the natural join (not that I think you should ever use that but when features are available somebody will use them), you will join on the wrong thing if you get a developer that uses it.

If you are copying joins to create a complex query, it is easy to forget to change the alias to the one you want and get an incorrect join. If each id is named after the table it is in, then you will usually get a syntax error. It is also easier to spot if the join ina complex query is incorrect if the pPK name and the FK name match.

HLGEM
  • 28,709
  • 4
  • 67
  • 116
  • +1: These are compelling reasons in my opinion, while the other answers opposing `ID` don't convince me at all. – phoog Apr 19 '12 at 20:31
  • Re: "If you are copying joins to create a complex query" - so your problem is copy&paste. Stop copy&paste and you will see how convenient car.id naming is. For FK joining use car.mfg = mfg.id, car.color=color.id, car.model = model.id - very simple and matches what you would write in LINQ. – alpav Feb 17 '15 at 02:06
  • Try writing something with 30 joins and you will see why it is an antipattern. – HLGEM Feb 18 '15 at 14:19
  • Aliasing is the first and foremost reason - it is such a headache when doing big complex reports. Natural joins are just a sweet bonus. It's amazing how other answers just ignore this points. I think adding some examples would make the point clearer and jump this answer higher to where it should be. – Lulu Jul 29 '18 at 19:01
  • With your solution you still have to alias the FK or PK because they have the same name in both tables. So your proposed solution creates more burden of typing and having to read longer descriptions. In longer queries you also have no idea when looking at table1.primary_key_name_id = table2.primary_key_name_id where the primary key actually resides. Disagree. – rball Jan 18 '23 at 17:08
6

The practice of using Id as primary key field leads to the practice where id gets added to every table. A lot of tables already have unique information that uniquely identifies a record. Use THAT as primary key and not an id field you add to each and every table. That's one of the foundations of relational databases.

And that's why using id is bad practice: id is often not information just an autoincrease.

consider the following tables:

PK id | Countryid   | Countryname
    1 |         840 | United States
    2 |         528 | the Netherlands

What's wrong with this table is that it enables the user to add another line: United States, with countrycode 840. It just broke relational integrity. Ofcourse you can enforce uniqueness on individual columns, or you could just use a primary key that's already available:

PK Countryid   | Countryname
           840 | United States
           528 | the Netherlands

That way you use the information you already have as primary key, which is at the heart of relational database design.

Pieter B
  • 12,867
  • 1
  • 40
  • 65
  • 1
    You'll understand why people add a generic key column to every table when you've had to migrate databases between differing systems, or have had the pleasure of merging databases together. – Cypher Feb 13 '15 at 20:56
  • 1
    This is occasionally the case, but in my experience it's pretty uncommon to have a nice unique immutable key. But even in your example you have a meaningless unique number to identify countries, just one that's allocated by ISO not by your database. – CodesInChaos Jun 22 '15 at 14:33
  • And now when the country name changes, you've got to update it throughout your entire database to fix. If you had simply used a surrogate key (like..."id"), the update would be trivial. Natural keys are great - when they are completely immutable, and never change. There are actually very few cases where this is true. – Gerrat Mar 06 '18 at 20:31
  • 1
    @Gerrat: If a country name changes, the ISO 3166-1 numeric code stays the same, only the ISO 3166-1 alpha-2 and alpha-3 codes change. This happened with Burma / Myanmar, for example. Likewise, if a country changes its territory but keeps its name, the ISO 3166-1 numeric code changes, but the ISO 3166-1 alpha-2 and alpha-3 codes stay. This happened when South Sudan split from Sudan and Eritrea split from Ethiopia. When East and West Germany reunited, they kept the alpha-2 and alpha-3 codes of West Germany but got a new numeric code. When countries completely dissolve or transform (think … – Jörg W Mittag Jun 09 '19 at 15:18
  • … outcome of the Balkan wars of the 90s), they get both new numeric and alpha-2 and alpha-3 codes. – Jörg W Mittag Jun 09 '19 at 15:19
  • @JörgWMittag: Basically, the country code here is already a surrogate key, and so this example is somewhat pointless itself. That doesn't diminish the very good practice of using surrogate keys everywhere. I have seen so many cases where a natural, never-thought-to-ever-change key was used, only to have a need to change it years later, causing a huge headache. Giving the surrogate key a more descriptive name than: `id` may be a good idea, but this answer isn't about that. – Gerrat Jun 11 '19 at 14:52
  • @Gerrat: I live in Germany, so I can definitely attest to the fact that "obvious" natural keys turn out to be not so key-ish after all. For example, after Germany re-united, we had duplicate zip codes for a while, and then we got a completely new system for zip codes. Where before larger cities had shorter zip codes, and the longest codes were 4 digits, now *all* cities have 5-digit zip codes. Where before, large city would have a single zip code and you would add the district to the name, in the new system, districts had their own zip codes, and adding the district to the name was … – Jörg W Mittag Jun 11 '19 at 19:11
  • … discouraged. So, literally not a single city kept its old zip code, many addresses which had the same zip code before now had different zip codes, and so on. In the specific case of ISO 3166 however, the codes are specifically designed as keys (they are based on the codes the UN statistics office uses). Whether they are actually keys for what you think they are is of course another question entirely. – Jörg W Mittag Jun 11 '19 at 19:14
5

There are some answers that approach what I would consider the most important reason for not using "id" as the column name for the primary key in a table: namely consistency and reduced ambiguity.

However, for me the key benefit is realized by the maintenance programmer, in particular one who was not involved with the original development. If you used the name "PersonID" for the ID in the Person table and consistently used that name as a foreign key, it is trivial to write a query against the schema to find out what tables have PersonID without having to infer that "PersonID" is the name used when it is a foreign key. Remember, right or wrong, foreign key relationships are not always enforced in all projects.

There is an edge case where one table may need to have two foreign keys to the same table, but in such cases I would put the original key name as the suffix name for the column, so a wildcard match, %PersonID, could easily find those instances as well.

Yes, much of this could be accomplished by a standard of having "id" and knowing to always use it as "tableNameID", but that requires both knowing that the practice is in place and depending on the original developers to follow through with a less intuitive standard practice.

While some people have pointed out that it does require some extra key strokes to write out the longer column names, I would posit that writing the code is only a small fraction of the active life of the program. If saving developer keystrokes was the goal, comments should never be written.

As someone who has spent many years maintaining large projects with hundreds of tables, I would strongly prefer consistent names for a key across tables.

Malachi
  • 168
  • 6
  • Suppose a `Companies` table has 2 foreign keys to a `Persons` table. One represents the company's President; the other represents the company's Treasurer. Would you really call the columns `PersonID1` and `PersonID2`? It would be far more descriptive to call them `PresidentID` and `TreasurerID`. I find it much easier to read `inner join Person AS Presidents ON Company.PresidentID = Presidents.ID` than `inner join Person AS Person1 ON Company.PersonID1 = Person1.PersonID` – phoog Apr 19 '12 at 20:28
  • 1
    No. In your example I would probably have a `CompanyOfficer` or `CompanyPerson` table which allows a many-to-many relationship with between `Company` and `Person` with additional information about the nature of the relationship. If I were to implement it within the `Company` table, I would use the column names `PresidentPersonID` and `TreasurerPersonID` to preserve the *PersonID part of the name while adding the additional descriptor. `inner join Person as Presidents on Company.PresidentPersonID = Presidents.PersonID` – Malachi Apr 25 '12 at 16:28
4

I always use 'id' as the primary column name for every table simply because it's the convention of the frameworks I use (Ruby on Rails, CakePHP), so I don't have to override it all the time.

That won't beat academic reasons for me.

Sfynx
  • 101
  • 2
2

I don't think it's a bad practice if it's used properly. It's common to have an auto-incrementing ID field called "ID" that you never have to touch, and use a friendlier identifier for the application. It can be a little cumbersome to write code like from tableA a inner join tableB b on a.id = b.a_id but that code can be tucked away.

As a personal preference I tend to prefix the Id with the name of the entity, but I don't see a real issue with just using Id if it's handled entirely by the database.

Wayne Molina
  • 15,644
  • 10
  • 56
  • 87
  • You would never join two tables by eaches primary key especially an auto-increment id. The sample above would be from tableA a inner join tableB b on a.id = b.table_a_id. – Bill Leeper Oct 18 '11 at 15:51
  • Yeah, that's what I meant. Almost lunchtime and need energy. – Wayne Molina Oct 18 '11 at 15:54
2

ID is common enough, that I don't think it would confuse anyone. You're always going to want to know the table. Putting fields names in production code without including a table/alias is a bad practice. If you're overly concerned about being able to quickly type ad hoc queries, you're on your own.

Just hope no one develops a sql database where ID is a reserved word.

CREATE TABLE CAR (ID);

Takes care of the field name, primary key, and auto increments by 1 starting with 1 all in one nice little 2 character package. Oh, and I would have called it CARS but if we're going to save on key-strokes and who really thinks a table called CAR is going to only have one?

JeffO
  • 36,816
  • 2
  • 57
  • 124
  • 1
    this shows why knowledge of formal relational theory is important, the table name represents what a **single row** is. The table `Car` represents a `Table` where each `Row` represents a single `Car`. Calling in `Cars` changes the semantic and shows a complete lack of understanding of formal relational theory basic principals. Rails is a prime example of someone that *knew enough to be dangerous*. –  May 06 '15 at 19:44
2

This question has been beaten over and over again, but I thought that I too would add my opinion.

  1. I use id to mean that that is the identifier for each table, so when I join to a table and I need the primary key I automatically join to the primary key.

  2. The id field is an autoincrement, unsigned (meaning that I never have to set its value and it cannot be negative)

  3. For foreign keys, I use tablenameid (again a matter of style), but the primary key I join to is the id field of the table, so the consistency means I can always check queries easily

  4. id is short and sweet too

  5. Additional convention - use lower case for all table and column names, so no issues to be found due to case

2

Another thing to consider is that if the primary key name is different from the foreign key name, then it is not possible to use certain third party tools.

For example, you would be unable to load your schema into a tool like Visio and have it produce accurate ERD's.

Martijn Pieters
  • 14,499
  • 10
  • 57
  • 58
1earldog
  • 1
  • 1
  • That would be the third party tool's fault though. Column naming conventions are no substitute for actual foreign keys, which the tool should introspect. – Gerrat Jun 11 '19 at 19:49
1

I find people here cover pretty much every aspect but I want to add that "id" is not and should not be read as "identifier" it's more of an "index" and surely it does not state or describe the row's identity. (I may have used wrong wording here, please correct me if I did)

It's more or less how people read the table data and how they write their code. I personally and most likely this is the most popular way I see more frequently is that coders write the full reference as table.id, even if they don't need to do union's or/and joins. For example:

SELECT cars.color, cars.model FROM cars WHERE cars.id = <some_var>

That way you can translate it to English as "Give me color and model of that car that is numbered as ." and not as "Give me color and model of that car that is identified as number ." The ID does not represent the car in any way, it's only car's index, a serial number if you will. Just like when you want to take the third element from an array.

So to sum up what I wanted to add is that it's just a matter of preference and the described way of reading SQL is the most popular.

Though, there are some cases where this is not used, such as (a far more rare example) when the ID is a string that is really describing. For example id = "RedFordMustang1970" or something similar. I really hope I could explain this at least to get the idea.