9

Lets say i have an application where user can register, and the username has to be unqiue value.

Now lets say i have N partitions and for each partition i have M replicas with multiple leaders.

Now i have questions regarding these scenarios:

First:

  1. User 1 attempts to register with username user1 - the write request gets routed to partition1 and to leader1

  2. User 2 attempts to register with username user1 - the write request gets routed to the same partition1 and also to the leader1

In this scenario the behavior is same as we had just one database. First transaction occures and the second one fails since the user1 value is already here and we are operating on the same replika

Second:

  1. User 1 attempts to register with username user1 - the write request gets routed to partition1 and to leader1
  2. User 2 attempts to register with username user1 - the write request gets routed to the same partition1 and to leader2

In this case we have concurrent write. How does this determine what registration fails and what not? We can look at this as no partition and multiple leader and as far as i researched in this case the typical solution is to either 1) prevent this by doing first scenario or 2) merge the values which is not acceptable in this case. Or solve conflicts on application level that is also not acceptable. How do DB's deal with this ?

Third:

  1. User 1 attempts to register with username user1 - the write request gets routed to partition1 and to leader1

  2. User 2 attempts to register with username user1 - the write request gets routed to the same partition2 and to leader3

In this case all writes go to different partitions ( what makes sense to me that this will probably not happen in real life since they have same value and thus should be routed to one partition ). How would the DB resolve what registration would succeed and which one would fail? How would it lock stuff or check if the value exists and so on?

The more i read about distributed DB's and how it works (even on high level ) im more and more confused.

Thanks for answers!

Johnyb
  • 339
  • 2
  • 6
  • 4
    You might want to read up on the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem). If you want to ensure consistency you have to sacrifice either availability or partition tolerance. It is not possible to get all three. – JonasH May 11 '23 at 13:52
  • The wiki page on [distributed transactions](https://en.wikipedia.org/wiki/Distributed_transaction) mentions "strong strict two phase locking" that could be used to ensure only a single machine succeeds at creating the user, ofc, you loose many of the benefits of a distributed system if you need a global lock. – JonasH May 11 '23 at 14:00
  • @JonasH i read about CAP theorem but its so abstract it does not answer any of my questions, if i dont use distrbitued transactions how would the system behave or how would it work to achieve wanted result? Your comment indicate there has to be other way since this is the most basic example and it would be weird if we lost all benefit of distributed system due to it – Johnyb May 11 '23 at 14:09
  • 1
    I'm no expert on databases. But I think the core idea is that you either accept write conflicts, or lose most of the benefits of distribution by using locks. – JonasH May 11 '23 at 14:12
  • 4
    I don't think this question can be answered in general. You need to choose some transactional model. It is key-value, SQL ACID transactions, Calvin transactions, something else? Different things will need to happen in each case. – Javier May 11 '23 at 14:25
  • 2
    @Johnyb theorems like CAP (and halting problem for example) are useful because they tell you what you *can't* do, then instead of wasting time trying to do that thing, you get to choose which thing that's actually possible you want to do instead. On the other hand, people like to talk about these theorems a lot more than is actually useful because it makes them sound smart, which is why it doesn't answer your question. – user253751 May 12 '23 at 12:51
  • @user253751 and the more i read/know about this stuff the more questions i have :/ – Johnyb May 12 '23 at 13:23

3 Answers3

12

As mentioned in comments, there's no single answer for this due to the CAP Theorem. There are trade-offs inherent to any approach you take.

One model for solving this problem is to use the key value to direct where the primary storage of the record is found. For example you could take a hash-digest of an email address, mod it by the number of storage locations, and that result tells you which will be the primary location for the record. Then when two registrations attempt to use the same email, both registrations will be put in the same place where you can enforce your uniqueness. This is basically the dynamo model. The downside is that if that node is unavailable or unreachable for any reason, you cannot register that email at all.

Towards the other end of the spectrum, you have the notion of eventual consistency. That is, applications work with instances that they can reach/prefer and at some point later, the storage nodes must be reconciled. This issue here is implied by your question: it's possible there could be conflicts and there needs to be some way of resolving them post facto.

There are many other possible models for this each with their own downsides. You can have each node inform the other nodes of a new key and only commit once every node agrees. The challenge there is that if you have nodes that cannot reach each other, the whole system becomes unable to record new transactions in a naive implementation. One workaround for that is to use the concept of a quorum to allow at least some part of the distributed system to remain operational. When the nodes that were disconnected re-enter the conversation, they need to first accept all the changes that they missed.

Unfortunately, this is not a situation where you can have your cake and eat it too.

JimmyJames
  • 24,682
  • 2
  • 50
  • 92
  • 1
    Another downside of the dynamo model is that you can't have more than one unique constraint per collection. – Philipp May 12 '23 at 10:49
  • A quorum would probably be a preferred solution for a scenario where one needs multiple leaders. Wait for a consensus on the transaction before you mark the registration as successful and you are safe. – Falco May 12 '23 at 11:58
  • +1. But, I'd suggest adding "intrinsically/externally" and "statistically" enforced constraints to the list. E.g., An intrinsically/externally enforced constraint could be a verified email address. If it exists in N partitions, it refers to the same thing and can be just be merged. Statistical would be a combination of prefixing and/or randomness on unique fields. These can even be added to user-provided data. – svidgen May 12 '23 at 15:44
  • Might also be worth mentioning that for all other non-identifying data, you often *don't* try to enforce constraints. You just assume uniqueness isn't enforced and you write business rules/logic that interpret and/or correct collisions predictably. – svidgen May 12 '23 at 15:45
  • @svidgen Thanks for the feedback, but it's not totally clear to me what you mean by 'list' here. I didn't intend to make this comprehensive of all possible distributed models but more an intro to the concepts with a few practical examples. – JimmyJames May 12 '23 at 16:15
7

If a given key is capable of being allocated in more than one place, then the answer to how uniqueness is enforced at all times is: you don't.

Depending on your needs, it may be possible to design the system so that no two places could ever allocate the same key (for example by using a prefix for each allocator).

However, where the user supplies the key themselves - such as an email address - there is obviously no prospect of prefixing.

There are really only two kinds of systems. Non-distributed systems which provide stronger guarantees, and distributed systems which occasionally require additional intervention and human labour to resolve conflicts and reset order.

The constraints you've specified where the database must be distributed but also must be consistent at all times (including each partition having a consistent view of what keys have been allocated and which haven't), are irreconcilable constraints.

Steve
  • 6,998
  • 1
  • 14
  • 24
  • 4
    "The constraints you've specified where the database must be distributed but also must be consistent at all times, are irreconcilable constraints." – Or phrased in terms of the CAP theorem: if you want consistency and partition tolerance, you have to compromise on availability. – Jörg W Mittag May 11 '23 at 15:42
  • @Steve in that case it means that for example registration in such distributed db system should be handled by certain partition with single leader multiplication? Or is would applications have dedicated database for registration that would also have single leader multiplcation/dedicated leader for registration? – Johnyb May 11 '23 at 18:53
  • "distributed systems which occasionally require additional intervention and human labour to resolve conflicts and reset order" - That's an interesting claim. I'd be curious for an example. Also interested in an example of a non-distributed system that *doesn't* "occasionally require additional intervention and human labor to resolve conflicts [and reset order]". I'm under the impression that *every* "system" requires some level of babysitting, correction, cleaning, etc.. – svidgen May 11 '23 at 19:46
  • 2
    @svidgen If we are talking about 'any system' a properly designed table in a single-instance DB with unique constraint should never allow two of the same values. I don't think the answer is talking about mistakes. I understand the answer to mean that in the normal, expected, happy-path execution of an eventually consistent system, you should expect some level of manual mitigation. – JimmyJames May 11 '23 at 20:05
  • 1
    @JimmyJames Sure. All systems require human intervention to account for "mistakes", and that's partly why I'd prod for clarification here. But, regardless of whether a system is "distributed", the potential for conflicts exists. It's not a question of whether constraints are enforced and conflicts are rejected. It's just a question of *when* and *who*. **When** does the system ask **who** to resolve the conflict. – svidgen May 11 '23 at 20:13
  • @svidgen So in the very basic dynamo-type model, if you were to use the field in question as a partition key, I expect that (disregarding errors) you would never have to resolve duplicates. – JimmyJames May 11 '23 at 20:17
  • @Johnyb, I'm not familiar with the term "single leader multiplication", but if you design a system where any given function (like registration) is handled only by a specific partition, then that function is effectively non-distributed. It's possible to design useful systems that have a mix, but there is additional complexity, and when done in a ham-fisted way the system can end up with considerable unnecessary complexity - with the trappings of a distributed system, but only the performance and availability of a non-distributed system. – Steve May 12 '23 at 05:08
  • 1
    @svidgen, the example I had in mind is where the same username is registered twice, once in each partition, but not with identical details (such as the password). Reviewing the differences and liaising with the customer/user would be the only sensible approach (as opposed to applying any kind of mechanical resolution). Also I wasn't talking about the need for labour for "housekeeping" in general - I was talking in context about data conflicts and resolution techniques. – Steve May 12 '23 at 05:16
  • 1
    @svidgen, also just to be clearer about underlying reasons why manual resolution is required, it's because there will be additional implied constraints in the system. It's necessary not just for the password (for example) to be consistent across partitions, but also for the customer's own records to be consistent with what is stored in the database. The details in two places may also be so obviously different (to the human eye) that the implication is that you're not dealing with the same customer in both places, but at least one customer who has registered under the wrong email/username. – Steve May 12 '23 at 05:22
4

All respect to the CAP theorem but actually for this use case there is a scheme that would give you uniqueness, consistency, partition tolerance, and availability. It's called a namespace.

If someone attempts to register as user1 on partition1 with leader1 they actually become user1@partition1.leader1. The thing about uniqueness is absolutely nothing is unique except within some scope. Namespaces impose a scope in which you can enforce uniqueness.

Now sure, 99% of the time there will be no collision and it's easier to just call them user1. But that's a nick name. It's not their full name. With the scope imposed by the namespace you're free to add new partitions, leaders and users to the existing system whenever you like.

With this solution the rest of the system isn't needed when adding a user. If your design can achieve that you avoid the whole CAP theorem issue altogether.

Oh, an if you don't like exposing details like partition1 and leader1 you can turn them into unique numeral codes like Social Security does.

Another example of avoiding CAP is how online shopping carts work. You want to buy 10 widgets from a store that has 100's of web servers. But you're only on one web server. How does that one web server know there are 10 widgets left to sell without checking all the other orders coming in from all the other web servers? Simple, the web server you are on had 15 widgets reserved just for it to sell. None of the other web servers even knew about them.

Study the CAP theorem. It's a sound theory. But understand that it's a trap that you're almost always better off finding a way to avoid rather than succumbing to the idea that you can never have all 3 on a distributed system.

candied_orange
  • 102,279
  • 24
  • 197
  • 315
  • 1
    but if we used `user1@partition1.leader1` example, how would user know it? Lets say he register as `user1`, would we have te tell him that next time he signs in he has to use `user1@partition1.leader1` or how would we route him to that same partition/leader? if we used just `user1`? Also, what if we rebalance our nodes? That data can be changed to different partition in that case. The more i read/know about these distributed systems, the more questions i have and its harder and harder to find any answers to them :( – Johnyb May 12 '23 at 13:22
  • @Johnyb how would user know it? You can tell the human. You can hide it in a cookie. If you want classic AP (eventual consistency) once all current nodes have signed off you can promote them to `user1` and then they can sign in from other devices. But they will always be user1@partition1.leader1 under the hood. Even if you eventually have to tell them that someone else has user1 and they have to pick something else if they really want a short name. Or explain nothing about what went wrong and just act like they never registered. As for routing, who cares what registered your user name? – candied_orange May 12 '23 at 13:39
  • 1
    @Johnyb Yes, this is why people don't just use distributed systems for everything but only when they _have_ to. Like the reason why we don't use rocket ships to travel to the grocery store. You know when you _have_ to use one because there won't be any other (easier) solution, even when you've spent quite a while working on the problem trying anything to avoid it being distributed. – user3067860 May 12 '23 at 14:36
  • Normal solution: Pros: Easy to use, readily available, cheap, pretty safe, works really well for most cases. Cons: Slower, not cool. Edge case solution: Pros: Cool, looks good on resume, can be more powerful than the normal solution. Cons: Expensive, hard to use/understand/implement/maintain, failure method can be really bad, still has limitations--just different limitations. – user3067860 May 12 '23 at 14:57
  • @user3067860 Maybe no rocket ships to get to the grocery store but what about rockets that deliver the groceries to your home? What could go wrong? – JimmyJames May 15 '23 at 21:17
  • @JimmyJames I think they have a name for those..."missiles". – user3067860 May 16 '23 at 13:12