I am considering using a NoSQL Document database as a messaging queue.

Here is why:

  • I want a client to post a message (some serialized object) to the server and not have to wait for a synchronous response.
  • I want to be able to pull messages off of the "queue" based on some criteria, which may be more sophisticated than just a priority level (I am working on a hosted web app, so I want to give all of my customers a fair amount of "computing time", and not let one customer hog all of the processing).
  • I want the queue to be durable - if the server goes down I want any remaining messages to be handled when it comes back up.

So, I am considering using MongoDB or RavenDB as a message queue. The client can post the message object to a web service which writes it to the database. Then - the service doing the work can pull the various message types based on any criteria that may arise. I can create indexes around the scenarios to make it faster.
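
To make the "fair share" idea concrete, here is a minimal in-memory sketch of that selection logic. It does not use MongoDB or RavenDB: `FairQueue`, `Message`, and the `served` counter are hypothetical names invented for illustration, and the list of messages merely stands in for a collection of documents. The only point is that "next" is chosen by a criterion (least-served customer first, then oldest message) rather than by arrival order.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import count

_seq = count()  # insertion order; stands in for a timestamp or document id


@dataclass
class Message:
    customer: str
    payload: dict
    seq: int = field(default_factory=lambda: next(_seq))
    status: str = "pending"


class FairQueue:
    """In-memory stand-in for a collection of message documents."""

    def __init__(self):
        self.messages = []
        self.served = defaultdict(int)  # messages handed out per customer

    def post(self, customer, payload):
        msg = Message(customer, payload)
        self.messages.append(msg)
        return msg

    def next_message(self):
        pending = [m for m in self.messages if m.status == "pending"]
        if not pending:
            return None
        # Least-served customer first, then the oldest message.
        msg = min(pending, key=lambda m: (self.served[m.customer], m.seq))
        msg.status = "processing"
        self.served[msg.customer] += 1
        return msg
```

With three messages posted by customer "a" followed by one from customer "b", the second call to `next_message()` returns b's message instead of making it wait behind all of a's. In a real document store the same ordering would presumably be expressed as a query plus an index, and the counter would need to be persisted somewhere.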

So - I am looking for someone to shoot a hole in this. Has anybody successfully done this? Has anybody tried this and failed in some way?

MattW
  • You could build a message queue on top of a database but you'd end up rewriting a bunch of logic that's already implemented inside of an existing message queue application. What is your aversion to using one? – Sean McSomething Mar 08 '13 at 00:12
  • The aversion currently is: I would like to pull messages off of the queue in some intelligent fashion - not just whatever is next (I know - that's the point of a "queue"). I was thinking of taking them off the traditional "queue", putting them in some DB, and then polling that DB to intelligently get the "next" message to process. – MattW Mar 08 '13 at 18:05

3 Answers

Check out the accepted answer on this question: https://stackoverflow.com/questions/4745911/nosql-databases

My take on having worked with both types of databases is that the real advantages of NoSQL lie in their scalability. They are well suited for ever-growing blobs of stuff that needs to exist on many nodes. After all, these are the applications that they were born out of (Facebook, Google...).

They have downsides too, and these are specific to the implementation. Personally, I've suffered from replication errors when multiple nodes deleted and re-populated objects within a short period of time. I'm not suggesting that this is pervasive, but the speed advantage often comes with fewer guarantees of consistency (i.e., you will have eventual consistency, but you don't want to depend on it).

If all you're doing is building a queue, then I don't see anything specific to NoSQL that makes it a preferred choice. The speed, reliability, and efficiency will come down more to the configuration of whatever implementation you decide to go with.

MrFox

There are no visible holes in it, as your requirements list is pretty short :-). Basically, the longer the requirements list, the greater the chance of finding holes in a queue you write yourself.

In my opinion, using a NoSQL database for this scenario would fit if:

  1. the requirements do not call for a full-featured queue
  2. the app will not have to move from the pull model to a push model (queue vs. pub/sub)
  3. the structure of the messages is pretty variable and changes over time
  4. the app needs to pull messages based on different criteria
  5. reusing the NoSQL database would reduce the number of systems the app depends on

As a side note, I'd (biasedly) encourage you to also take a look at RethinkDB.

Alex Popescu

I agree with MrFox. You need to consider that if several threads update the same data in the queue, your database must support true ACID transactions; otherwise you risk the same item in the queue being processed more than once, as well as duplicates of the same item ending up in the queue.
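
To illustrate the double-processing risk, here is a rough sketch of an atomic "claim" step. It is a toy: `DocumentStore`, `claim_one`, and `run_workers` are hypothetical names, and a `threading.Lock` stands in for whatever atomic update the database offers (in MongoDB the comparable single-document primitive is findAndModify, exposed in PyMongo as `find_one_and_update`). The idea is that checking for "pending" and flipping to "processing" happen as one step, so two workers cannot both pick up the same item.

```python
import threading


class DocumentStore:
    """Toy store whose only guarantee is an atomic check-and-update."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()  # models the database's atomic update

    def insert(self, doc_id, body):
        with self._lock:
            self._docs[doc_id] = {"status": "pending", "owner": None, "body": body}

    def claim_one(self, worker):
        """Atomically flip one pending document to 'processing'; return its id."""
        with self._lock:
            for doc_id, doc in self._docs.items():
                if doc["status"] == "pending":
                    doc["status"] = "processing"
                    doc["owner"] = worker
                    return doc_id
        return None


def run_workers(store, n_workers):
    """Start n_workers threads that claim documents until none are pending."""
    claimed = []
    claimed_lock = threading.Lock()

    def work(name):
        while (doc_id := store.claim_one(name)) is not None:
            with claimed_lock:
                claimed.append(doc_id)

    threads = [threading.Thread(target=work, args=(f"w{i}",)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed
```

If the check and the update were two separate operations, two workers could both see "pending" before either wrote "processing". Note that this only covers the claim itself; a worker that crashes after claiming would leave the item stuck, which a real design would have to handle with something like a timeout.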

If the total amount of data that you post to the queue is not at big-data scale (>PB), I would advise selecting another type of database, or at least one that supports true consistency.

A processing queue is better suited to an OLTP-type database, since you are basically doing more inserts and updates than anything else.

  • Many NoSQL databases do provide ACID guarantees within a single document/aggregate. That may be sufficient for many use cases. Also, an ACID-compliant database alone can't prevent you from making duplicate transactions or ensure that messages are sent reliably. You either need some idempotence guarantee from the remote server, or it needs to support XA transaction semantics. Without either, you're doomed to have some possible duplicates or missing messages. – Lie Ryan Jul 04 '14 at 21:09
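
As a small illustration of the idempotence point, the sketch below wraps a handler so that redelivering the same message has no further effect. `IdempotentConsumer` and `deliver` are hypothetical names, and the in-memory `seen` set is an assumption made for brevity; a real consumer would have to keep the processed ids in durable storage.

```python
class IdempotentConsumer:
    """Runs the handler at most once per message id."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # assumption: a real system would persist this

    def deliver(self, message_id, payload):
        """Return True if the message was handled, False if it was a duplicate."""
        if message_id in self.seen:
            return False
        self.handler(payload)
        self.seen.add(message_id)
        return True
```

With this in place, at-least-once delivery from the queue is tolerable, because a duplicate of an already handled message is simply dropped. The sketch records the id only after the handler returns, so a crash in between would still cause a repeat; closing that gap is where transactional guarantees come back in.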