
I'm designing a system in which a key component acts like a "journal", in the sense that it stores a list of sequential events that can change the state of the entire system. Upon request, any other component in the system can ask this component for the full list of events that happened since event XXX.

One can think of the sequential event ID as the key, and of the event itself (a JSON-encoded object of a few KB) as the value.

The system needs to keep a history of a couple of days' worth of data, or alternatively a few million records, but may need to scale to tens of millions. Past events should be easy to remove from storage automatically.
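
To make the shape of the component concrete, here is a rough sketch of the interface I have in mind (Python; all names are made up for this question, not an existing API):

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: int   # monotonically increasing sequence number (the key)
    payload: str    # JSON-encoded object of a few KB (the value)


class Journal:
    """Illustrative interface only; not an existing library."""

    def append(self, payload: str) -> int:
        """Store a new event and return its sequential ID."""
        ...

    def events_since(self, event_id: int) -> list[Event]:
        """Return every event with an ID greater than event_id, in order."""
        ...

    def prune_before(self, event_id: int) -> None:
        """Drop old events (retention: a couple of days / a few million records)."""
        ...
```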

Preferably, it should be easy to scale this storage to multiple nodes using master-master or even master-replica replication.

Any suggestions for such storage would be helpful.

shevron

1 Answer


Some quality considerations matter here that you didn't make clear:

  • how important is it to preserve data? (If you restart the service, is it OK if it loses data?)

Several plausible approaches jump to mind:

  • Keep data in memory. Easiest to implement and best performing; just use a machine with a lot of memory. Loses data when you restart the service (see the sketch after this list).

  • MongoDB. A very good general-purpose, very fast approach; any other NoSQL DB would work here too.

  • Filesystem store. Just store the JSON blob with all your data in a file named by the key. This is moderately slow to update, but may be fast enough for your purpose; it is very easy to implement, scales pretty well, and the disk can be shared across machines (SMB).

  • Relational DB. Similar to MongoDB, but generally more expensive; mostly a matter of preference.
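
For illustration, a minimal sketch of the in-memory option (Python; the class and method names are made up, not an existing library):

```python
import itertools
import json
from collections import OrderedDict


class InMemoryJournal:
    """Minimal sketch of the in-memory option: fast, simple, volatile."""

    def __init__(self):
        self._events = OrderedDict()    # event_id -> JSON string, insertion order = ID order
        self._ids = itertools.count(1)  # sequential event IDs

    def append(self, event: dict) -> int:
        event_id = next(self._ids)
        self._events[event_id] = json.dumps(event)
        return event_id

    def events_since(self, event_id: int) -> list[tuple[int, dict]]:
        # IDs are assigned in increasing order, so scanning in insertion
        # order is correct; for tens of millions of entries you would
        # want something smarter than a linear scan.
        return [(eid, json.loads(blob))
                for eid, blob in self._events.items() if eid > event_id]

    def prune_before(self, event_id: int) -> None:
        # Drop old history; suitable for a low-priority background task.
        for eid in [e for e in self._events if e < event_id]:
            del self._events[eid]


# Usage:
journal = InMemoryJournal()
first = journal.append({"type": "user_created", "name": "alice"})
journal.append({"type": "user_renamed", "name": "bob"})
print(journal.events_since(first))  # -> [(2, {'type': 'user_renamed', 'name': 'bob'})]
```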

In all these cases, you can delete old elements from your store with a very low-priority background task (if you use a DB, keep an index on the date used for expiry).
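
If you pick MongoDB, its built-in TTL indexes can handle the expiry for you instead of a hand-written task. A sketch using pymongo, where the database, collection, and field names are assumptions:

```python
from datetime import datetime, timezone

from pymongo import ASCENDING, MongoClient

# Assumed names: database "journal", collection "events",
# fields "event_id", "payload", "created_at".
client = MongoClient("mongodb://localhost:27017")
events = client.journal.events

# MongoDB drops documents roughly two days after "created_at" via the TTL
# index, so no custom expiry task is needed.
events.create_index([("created_at", ASCENDING)], expireAfterSeconds=2 * 24 * 3600)
events.create_index([("event_id", ASCENDING)], unique=True)


def append(event_id: int, payload: dict) -> None:
    events.insert_one({"event_id": event_id,
                       "payload": payload,
                       "created_at": datetime.now(timezone.utc)})


def events_since(event_id: int):
    return events.find({"event_id": {"$gt": event_id}}).sort("event_id", ASCENDING)
```

A replica set then gives you the primary/replica scaling mentioned in the question; the relational-DB equivalent of the expiry would be a periodic low-priority DELETE on the indexed date column.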

Lewis Pringle