I've been trying to figure out how to use HBase/Cassandra for a token system we're re-implementing. I could probably squeeze quite a lot more out of MySQL, but it feels like we would be clinging to the wrong tool for the task just because we know it well. Eventually we will hit a wall (as has happened to us in other areas). Naturally, I started looking into possible NoSQL solutions. The prominent ones (at least in terms of buzz) are HBase and Cassandra.

The story is more or less like this:

  1. A user can send a gift to other users.
  2. Each gift either has a list of recipients or is public, in which case it is limited by number of recipients or by expiration date.
  3. For each gift sent we generate a token that uniquely identifies that gift.
  4. For each gift we track the list of potential recipients and their current status relating to that gift (accepted, declined, etc.).
  5. A user can request to see all of their currently pending gifts.
  6. A user can request a list of users they have sent a gift to today (used to limit the number of gifts sent).
  7. We need the ability to "dump" or "ignore" expired gifts (gifts older than x days are considered expired).

There are some other requirements but I believe the above covers the essentials.
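
To make the access patterns concrete, here is a rough in-memory sketch in Python. It is not an HBase or Cassandra design, just the operations from the list above written out. All names (`Gift`, `GiftStore`, `pending_for`, and so on) are made up for illustration, and the 14-day expiry is only an assumed value for "x days".

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Set

# Assumed value for "x days"; the real threshold is a business decision.
EXPIRY = timedelta(days=14)


@dataclass
class Gift:
    token: str                      # unique token identifying the gift (item 3)
    sender: str
    sent_at: datetime
    # recipient -> status ("pending", "accepted", "declined", ...) (item 4)
    recipients: Dict[str, str] = field(default_factory=dict)
    # Only meaningful for public gifts limited by number (item 2).
    max_claims: Optional[int] = None

    def expired(self, now: datetime) -> bool:
        return now - self.sent_at > EXPIRY


class GiftStore:
    """Hypothetical in-memory stand-in for whatever store ends up being used."""

    def __init__(self) -> None:
        self.gifts: Dict[str, Gift] = {}

    def send(self, sender: str, recipients: List[str], now: datetime,
             max_claims: Optional[int] = None) -> str:
        # Items 1 and 3: create the gift and generate its token.
        token = uuid.uuid4().hex
        statuses = {r: "pending" for r in recipients}
        self.gifts[token] = Gift(token, sender, now, statuses, max_claims)
        return token

    def set_status(self, token: str, recipient: str, status: str) -> None:
        # Item 4: update one recipient's status for one gift.
        self.gifts[token].recipients[recipient] = status

    def pending_for(self, user: str, now: datetime) -> List[str]:
        # Item 5: tokens of non-expired gifts still pending for this user.
        return [g.token for g in self.gifts.values()
                if not g.expired(now) and g.recipients.get(user) == "pending"]

    def recipients_today(self, sender: str, now: datetime) -> Set[str]:
        # Item 6: everyone this sender has sent a gift to on the current date.
        result: Set[str] = set()
        for g in self.gifts.values():
            if g.sender == sender and g.sent_at.date() == now.date():
                result.update(g.recipients)
        return result

    def purge_expired(self, now: datetime) -> int:
        # Item 7: drop expired gifts and report how many were removed.
        expired = [t for t, g in self.gifts.items() if g.expired(now)]
        for t in expired:
            del self.gifts[t]
        return len(expired)
```

Every query here is a full scan, which is exactly what stops working at our volumes. What I am after is how to lay out rows and keys so that items 5, 6 and 7 become cheap lookups.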

How would I go about modeling that using HBase or Cassandra?


Well, the wall was performance: a few tens of millions of records per day across 2 tables, kept for 2 weeks (I wish I could have kept them longer, but there was no way). Response times kept getting slower and slower until eventually we had to start cutting down the number of days we kept data. Caching helps here, but it's not an ideal solution since a big part of the operations are updates.

Also, as I hinted in my original post, we use MySQL extensively. We know exactly what it can and can't do, from naive implementations, through native partitioning, to horizontally sharding our dataset at the application level across multiple DB nodes. It can be done, but that's not really what I'm trying to get out of this. I asked a very specific question about designing a solution with a NoSQL database, since it's very hard to find example designs out there.

Brainlag, I'm not trying to come off as rude. I actually appreciate a lot that you are the only one who even bothered to respond, but I see it over and over again: people ask questions, and others assume they have no idea what they're talking about and give an irrelevant answer. Please ignore RDBMS. The question is about NoSQL.

  • Sounds like a perfect use case for MySQL to me. What is the real problem with MySQL? "Hit a wall" could mean everything. – Brainlag Sep 10 '11 at 22:13
  • I don't want to sound rude either, but apparently you don't know everything about MySQL. Use another storage engine, such as TokuDB. I've had excellent performance with billions of rows. This *is* a job that fits MySQL. It's the approach that's wrong. If you still want to use NoSQL because it's a hot topic and every script kiddie on the block uses it, well, have fun. Especially when you start losing records for no reason. And you will. – Mjh Feb 09 '12 at 10:24

2 Answers

Part of the issue with NoSQL databases is that they are newish products with widely varying feature sets and usage styles. As a result, there is no one right way to use a NoSQL database.

I have some familiarity with Riak, and I believe it would support the feature set you describe, including expiring your old gifts. It also supports growing request capacity nearly linearly by adding new servers.

Just one caveat: on a colleague's project with high data-expiry requirements (think hundreds per second), some NoSQL databases were a problem because deletes were just too slow. So model and evaluate before becoming totally committed.

Michael Shaw

Given the lack of answers for the two DBs you mentioned, I'd like to suggest you have a look at MongoDB or Redis. Both are extremely fast, and modeling your data to fit either system should be fairly simple.

T3hc13h