
I have a set of data (assume they are objects) with unique immutable names, like this:

class Datum {
    final String name;
    // other fields
}

Considering that:

  1. I don't need to support rename. (The names are immutable as I mentioned)
  2. The names are unique. Among the data I store, there are no two data with the same name.
  3. There may or may not (depending on how future plugins will be using the API) be a lot of API calls to search a datum by name.

When should I index my data with an additional numeric ID, rather than just by the name?

  • in databases?
    • The ID would be an auto_increment primary key from the database engine
    • The name is provided
  • in runtime in-memory storage (most likely a hashtable, but also other data structures that may be applicable)?
    • The ID is an incrementing value attached to each datum when the datum is loaded.
    • The data may be inserted into or removed from the storage at any time, in no specific order, and may be re-inserted (after being removed), but the same datum occurs in the storage at most once at any given time.
    • Consider implementation in different languages. For example, in Java, a HashMap<String, ?> vs HashMap<Integer, ?> may be used. In PHP, the difference is the key used in the array that stores the data.
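
To make the runtime comparison concrete, here is a minimal Java sketch of the two layouts. The class names `NameKeyedStore` and `IdKeyedStore` and all of their methods are hypothetical, and `Datum` is reduced to its name. It is only meant to show what each choice implies for lookups by name, not to recommend either one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Datum class above; only the name matters here.
final class Datum {
    final String name;
    Datum(String name) { this.name = name; }
}

// Option A: the immutable, unique name is the only key.
final class NameKeyedStore {
    private final Map<String, Datum> byName = new HashMap<>();

    void add(Datum d) { byName.put(d.name, d); }
    Datum remove(String name) { return byName.remove(name); }
    Datum findByName(String name) { return byName.get(name); }
}

// Option B: a runtime-assigned numeric ID is the primary key, so searching
// by name needs a second index that has to be kept in sync with the first.
final class IdKeyedStore {
    private final Map<Integer, Datum> byId = new HashMap<>();
    private final Map<String, Integer> idByName = new HashMap<>();
    private int nextId = 0;

    // Assigns the next ID to the datum and returns it.
    int add(Datum d) {
        int id = nextId++;
        byId.put(id, d);
        idByName.put(d.name, id);
        return id;
    }

    Datum remove(String name) {
        Integer id = idByName.remove(name);
        return id == null ? null : byId.remove(id);
    }

    Datum findById(int id) { return byId.get(id); }

    Datum findByName(String name) {
        Integer id = idByName.get(name);
        return id == null ? null : byId.get(id);
    }
}
```

In this sketch, a datum that is removed and later re-inserted into `IdKeyedStore` receives a fresh ID, so any ID held by a caller from before the removal no longer resolves.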

Note: The two questions are independent from each other, i.e. yes for database only but no for runtime only may also be a reasonable answer.

SOFe
  • IMHO, both are valid options. Using a numeric ID is often used, but by no means necessary. If names are unique and don't require renaming, you may as well go with it. – dagnelies Oct 10 '16 at 13:38
  • Btw, "search a datum by name" seems like a bad idea. Better convert the name into a "real" date and search with that. – dagnelies Oct 10 '16 at 13:42
  • @dagnelies A [datum](https://en.wikipedia.org/wiki/Datum) is not necessarily date or time related information. – Dan Pichelman Oct 10 '16 at 14:10
  • 2
    Possible duplicate of [Is there a canonical source supporting "all-surrogates"?](http://programmers.stackexchange.com/questions/204521/is-there-a-canonical-source-supporting-all-surrogates) –  Oct 10 '16 at 14:14
  • You can also use a UUID. What I dislike about auto-increment IDs shows up when designing web APIs: I don't feel comfortable using this sort of ID as a path variable. It seems to me a very easy design to exploit. This is why I recently started using UUIDs. – Laiv Oct 10 '16 at 18:25
  • What's the difference between a UUID and a name if the name is guaranteed immutable and unique? – SOFe Oct 11 '16 at 09:14
  • @Snowman this question is partly a duplicate of the question you pointed out, but the other half of it is about runtime memory indexing, which is probably not related to that question. How can I edit this question to avoid the duplication? – SOFe Oct 11 '16 at 09:31
  • I trust the immutability and uniqueness of a UUID far more than that of any String in any data model, no matter how immutable it seems to be. – Laiv Oct 11 '16 at 18:45

2 Answers


The case you describe is so bizarre that I question whether it is a purely theoretical one.

Here are the facts as I understand them:

  1. You must have a unique Name independent of the need for a unique Identifier
  2. People will search for data by that Name, and the Names will be user entered (not auto-generated)
  3. You don’t care what the Name is, i.e. if a mistake is entered there is no need to fix it.

The primary advantage of using an auto-generated Key is the ease of creating new records (an auto-incremented number and an auto-created text ID are the same in this respect): you don’t have to check whether a value already exists before inserting, and you don’t have to prompt the user to enter one.

Now, if you HAVE to check whether a value already exists before inserting and you HAVE to prompt the user for one, there is no need for another Key; you can just use Name. But I strongly urge you to check your assumptions.
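
As a rough illustration of that "check before inserting" case, here is a minimal Java sketch that uses the user-entered name as the only key. `DatumRegistry`, `register`, and `find` are hypothetical names, and `Datum` is reduced to its name; only `Map.putIfAbsent` is a real library call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical record type; "name" is the user-entered unique key.
final class Datum {
    final String name;
    Datum(String name) { this.name = name; }
}

final class DatumRegistry {
    private final Map<String, Datum> byName = new HashMap<>();

    // Returns true if the datum was stored, false if the name is already taken.
    boolean register(Datum d) {
        // putIfAbsent returns null only when no mapping existed before the call,
        // so the existence check and the insert happen in a single step.
        return byName.putIfAbsent(d.name, d) == null;
    }

    Datum find(String name) { return byName.get(name); }
}
```

Since the uniqueness check is needed either way, a generated key would add a second lookup path without removing this one.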

Morons

This makes sense to me. There are some problems where the business model offers up a natural "primary key".

If you can guarantee that your unique field is in fact unique and immutable, then you may as well use it.

Put another way, would adding a generated key improve your app?

Having said that, I agree with Morons - it's mighty rare to come across a situation where you can make that sort of ironclad guarantee.

One possible example might be if you had to store information related to a chessboard: there are 64 squares, with the rows numbered 1 - 8 and the columns labeled a - h. There will never be any more, and "a1" is guaranteed to be unique.
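
A small Java sketch of that chessboard case, keyed directly by the square name with columns a through h and rows 1 through 8. The `Chessboard` class, the `emptyBoard` method, and the `"empty"` placeholder value are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class Chessboard {
    // Builds a map keyed by the natural square name ("a1" .. "h8").
    // The String value is a placeholder for whatever per-square data is stored.
    static Map<String, String> emptyBoard() {
        Map<String, String> squares = new LinkedHashMap<>();
        for (char column = 'a'; column <= 'h'; column++) {
            for (int row = 1; row <= 8; row++) {
                // Concatenating onto "" turns the char and int into text, e.g. "a1".
                squares.put("" + column + row, "empty");
            }
        }
        return squares;
    }
}
```

Because the set of keys is fixed and known in advance, a generated ID would carry no information that the square name does not already provide.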

In business, it's tempting to use "part number" or "stock number" as an ID. Don't do it.

Dan Pichelman