3

We have an occasionally connected, event sourced system: a budgeting application where each client can pull other client's event store from a separate storage (network share, dropbox,...).

It is our idea to assign each event the current client's (incremented) vector clock so that we can determine a global order of events when pulling the events from other clients.

However, order via vector clocks is only guaranteed if there were no concurrent updates.

We thought about adding a local timestamp to the vector clock which is to be used to resolve such conflicts. The resulting order would be the same on each client because the timestamps are generated at the time of the event itself.

If two timestamps are identical, we'd fall back on ordering via client id.

Is this approach sane and safe enough?

Additional information

We support multiple clients (i.e. devices). Each client maintains a local event store. Clients can publish their event store, at which point it becomes available for other clients to pull. Example: Client A saves its latest version to a folder in Dropbox - other clients may then read the persisted information.

When pulling the event stores from other clients, we have to merge them into our own store and reconstruct the global order of events (concurrency is last wins). This means that each client, over time, will share the same list of events (eventual consistency).

To get a global order, we want to use vector clocks. Each event has a unique vector clock of that point in time associated with it. Events are created by any client and initially are stored only in that client's local event store. Adding new events to the event store increments that client's vector clock.

We will have events such as Transaction added to account, Transaction description modified, Budgeted amount adjusted etc. Each new client will be a fork of the event store of some other client, possibly merged with other client's event stores. After that, it can add its own events concurrently while other clients are active.

One such use case is: I'm out shopping and enter a transaction in my phone while somebody else at home is adjusting the planned budget and clearing transactions according to our bank account statement. My transactions may show up once the client at home pulled my latest updates, but both clients continue to operate in a somewhat disconnected fashion with concurrent updates on both clients.

The concurrent updates should not be a problem with an event sourced architecture, since we can determine the full state of the application by replaying all events in order. We will only see some conflicts if multiple clients modify the same aggregates or resulting read models - in which case: last wins and try talking to each other before working on the same things.

urbanhusky
  • 131
  • 5
  • I don't think we could answer this without more information, such as how you might use the total ordering. Vector clocks are supposed detect when a change is being generated with stale information. As distributed system protocols are really hard to get right, you might model your system (e.g. with [TLA+](https://en.wikipedia.org/wiki/TLA%2B)) to help look for issues. – Erik Eidt Jul 21 '16 at 18:14
  • 1
    I've added additional information about what we do and how we plan to do our pull-based replication. – urbanhusky Jul 21 '16 at 21:53

0 Answers0