3

After reading some questions about the probability of UUID collisions, it seems that collisions, although unlikely, are still possible and that a conflict-resolution strategy is still needed. I am therefore wondering about the background of choosing UUIDs for CouchDB:

  • Is the "unlikely collision" a responsibility of the developer?

  • Was it expected that IDs would be used by only a small set of clients?

When I went through the documentation, CouchDB's algorithm looked well suited to withstanding network partitions. But the more I read about the problems of distributed ID generation, the more I believe that taking the UUID collision risk is only feasible with a low number of clients.

Although I am still interested in the previous questions, the main thing I want to find out is:

  • Is it normal practice to accept the collision risk of UUIDs by counting on a low number of distributed generators? Or is it always assumed that the probability of a collision is so low that it is not a concern?
SystematicFrank

2 Answers

4

The whole point of UUIDs is that the risk of collisions can be safely ignored. A conflict solution is not needed.

If you look at your log files and see a message "Fatal Error: UUID collision detected" then you can bet that the message is due to a bug in some code, and not due to a UUID collision.
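To put a rough number on "safely ignored", here is a minimal sketch using the standard birthday-bound approximation. It assumes IDs with 122 uniformly random bits (as in a version 4 UUID) drawn from properly seeded generators; the function name `collision_probability` and the default bit count are my own choices for illustration, not anything CouchDB defines.

```python
import math


def collision_probability(n_ids: int, random_bits: int = 122) -> float:
    """Approximate chance of at least one collision among n_ids random IDs.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    Assumes every ID is drawn uniformly and independently.
    """
    space = 2.0 ** random_bits
    exponent = n_ids * (n_ids - 1) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>14,d} IDs -> p ~= {collision_probability(n):.3e}")
```

Under these assumptions the result depends only on the total number of IDs generated, not on how many clients generate them. That changes if the generators are poorly seeded or share state, which is a bug rather than a property of UUIDs.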

gnasher729
1

I have very little experience with CouchDB, but starting in version 1.3 there is a suffix you can configure for the utc_id algorithm: utc_id_suffix.

In general, I trust random generators when the IDs come from a single origin (e.g. one instance of CouchDB), but not so much when they are generated by different sources using the same algorithm and format (e.g. replicating instances of CouchDB). The same applies to processes and threads, depending on the scope. The exception is when there is a mechanism that specifically distinguishes the origin of each random sequence, like the suffix above.
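As an illustration of the idea (not CouchDB's actual implementation), here is a small sketch of a timestamp-plus-suffix ID in the spirit of utc_id. The function name `utc_id_like`, the 14-hex-digit microsecond timestamp, and the suffix values are assumptions for the example.

```python
import time
from typing import Optional


def utc_id_like(suffix: str, now_us: Optional[int] = None) -> str:
    """Build an ID from a microsecond UTC timestamp plus a per-origin suffix.

    Hypothetical helper: the 14-hex-digit timestamp layout is an assumption
    made for this sketch, loosely modelled on the utc_id description.
    """
    if now_us is None:
        now_us = time.time_ns() // 1000  # microseconds since the Unix epoch
    return f"{now_us:014x}{suffix}"


if __name__ == "__main__":
    # Two origins generating an ID in the very same microsecond still differ,
    # as long as each origin has been given its own suffix.
    same_instant = 1_700_000_000_000_000
    print(utc_id_like("node-a", same_instant))
    print(utc_id_like("node-b", same_instant))
```

The suffix only helps if every origin really has a distinct one, and a single origin can still repeat an ID if it generates two within the same microsecond.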

FranMowinckel