15

Possible Duplicate:
When would someone use MongoDB (or similar) over traditional RDMS?

How well the SQL and NoSQL go head-to-head. I read somewhere that SQL databases are not well for data that is well structured or has some graph-iness associated with it. Is it really the case?

Apart form Facebook, Google, and some other big players on the web, I don't know how well small-players and start-ups have used these tools. I found another similar question about the same here. But I couldn't gather much stats from here. These are certain specific cases, and is there a general pattern (like the one mentioned above) for which these NoSQL databases can be used?

How wise will it be for a start-up to go for a NoSQL database if the developers know that the amount of data involved will be a little large and is well structured, but requires frequent CRUD operations?

Here on stackoverflow, one can find questions about when not to use SQL, but are there any scenarios when one should avoid using NoSQL databases? Also, how effective will it be to use both in parallel so that we get the best of them both?

One last question, do these distributed NoSQL databases perform equally well when they are used in a single-node setup?

c0da
  • 1,526
  • 3
  • 12
  • 20
  • Hmm... so SQL according to your statement is ill suited for highly structured data, like a general ledger or an order entry system?? I don't think this statement is quite what you meant.... – marc_s Sep 19 '11 at 05:16
  • Like I mentioned in the question, I read it somewhere that SQL is not useful for data that follows a particular structure or has some graphiness associated with it. Does it mean that the NoSQL databases are able to handle the case better?? –  Sep 19 '11 at 05:21
  • 6
    Um, SQL is all about structured data, tables and foreign keys look pretty structured to me. And a database that supports `WITH RECURSIVE` takes care of many of the "graphiness" (or "tree-iness") issues. – mu is too short Sep 19 '11 at 05:25
  • @muistooshort: Still, it will be to slow comparing to neo4j's capeability to iterate 1 milion graph nodes in one second. – the_drow Sep 19 '11 at 05:49
  • 1
    @the_drow: I did say *many*, not *all*. There are probably things that a relational database can do without breaking a sweat that neo4j can't. Right tool, right job, etc. – mu is too short Sep 19 '11 at 05:54
  • My this question may help- http://programmers.stackexchange.com/q/84477/963 – Gulshan Sep 19 '11 at 06:17
  • 1
    MongoDB is web scale: http://www.youtube.com/watch?v=b2F-DItXtZs –  Sep 19 '11 at 08:01
  • @c0da: I think you need to reformulate: what you probably mean to say is that SQL databases are not well-suited to data with certain *types* of structure - in particular, "graphy" data in which you need to do a lot of path walking. – reinierpost Sep 19 '11 at 08:40

2 Answers2

8

Here are some pros for classical SQL databases like Oracle, MS SQL, Sybase, DB/2 against NO/SQL:

  • very mature
  • existing ecosystem (documentation, tools, lots of third party vendors, bindings to different programming languages etc.)
  • extensive security models
  • easy ad-hoc-queries, with very powerful query languages
  • most of them are scalable for hundreds of users/connections
  • Views and stored procedures
  • trustworthy transaction models
  • different indexing capapatibilities
  • different goodies like replication mechanisms, geographic datatypes, etc.

Those features are really good if you are going to develop some enterprise software system, where you don't know exactly what applications will work with these data in a few years. I don't think there are any No/SQL databases available today which can compete in the listed features with any of the "big" SQL databases.

On the other hand, NO/SQL-databases show benefits when you don't need many of those things, for example, as a backend db of a web application you won't need a complex security model, because there may be mostly just one user (the application server process) connected to you db at production time. Of course, you could also use some lightweight SQL database like SQLlite for those kind of applications (and many people do this), but then there's also to consider the structure of the data you are going to store. For example, a blog system, a QA forum, a content management system, a Wiki: there is a lot of text or html code or xml code to store, and less adminstrative data. That may be systems where NO/SQL can show it's benefits.

Doc Brown
  • 199,015
  • 33
  • 367
  • 565
  • What if the amount of data involved is huge? Would SQL not slow down? – c0da Sep 19 '11 at 07:26
  • I mean what if the data sets are large? – c0da Sep 19 '11 at 07:48
  • 2
    @c0da: define "large". I have seen relational databases with more than 100 Million (small) records in a single table, where the indexing capabilities of the DB helped a lot to manage that data. Of course, these systems needed a lot of finetuning and DB specific performance optimization to work well. On the other hand, if you have a lot of unstructured or weakly structured data (like image data, data from scientific measurement, large amount of free text etc)., SQL databases may not be able to play out their strengths. – Doc Brown Sep 19 '11 at 08:34
7

There is only one scenario where NoSQL is the way to go. And that is when you need to be very very flexible regarding database traffic.

That means you have to expect that all of a sudden the database-requests explode by a factor of - let's say - 10 or 100. The ability to react to such situations is called elasticity.

Here you have a problem with classic relational databases, cause due to their specialization on optimizing for JOINs they cannot just be copied to another server while still guaranteeing relational integrity. And even then copying those huge databases takes a lot of time.

So to deal with those explosion of request situations companies came up with different database-concepts that are specialized for that.

If you don't have to expect such extreme flucutations in your request-load, then relational is the way to go. And that maybe forever, cause what many people don't understand is that relationality is not just some tech-buzz-word from last century - information is almost always inherently of vectorial and therefore relational nature.

Raffael
  • 505
  • 3
  • 14
  • 1
    This is not the most common case for NoSQL. The most common case is *availability* e.g. the database should always be available, that means that you need a distributed multi-master database. And that is something that traditional RDBMS's is not very good on at the moment. – Jonas Sep 19 '11 at 09:41
  • 6
    a relational DB is not good at being available? I don't even know what that is supposed to mean ... – Raffael Sep 19 '11 at 10:03
  • Exactly, a traditional relational database is mostly not as available as a NoSQL system. This is the reason why Google and Amazon started to develop their own NoSQL systems. Their business is so important so downtime is not acceptable. – Jonas Sep 19 '11 at 10:05
  • well then availabilty is just another word for what I called elasticity ... cause a DB becomes unavailable when it's not appropriately scaled and that is what elasticity is about ... jit scaling without having to set up a server system that is overdimensioned 90% of the time. – Raffael Sep 19 '11 at 10:08
  • 1
    No, that is a different thing. A database can be scalable, but still not have high availability. – Jonas Sep 19 '11 at 10:13
  • why should a DB be not available? this happens when traffic exceeds the capacity either b/c of to many CRUD-ops or data per time unit. I get the impression you just don't know what you are talking about. – Raffael Sep 19 '11 at 10:16
  • A small database with only say 100 records on a single computer **is** scaled appropriately, but it is not highly available since computers fails sometimes and need to be repaired, so the database need to be placed on multiple computers. – Jonas Sep 19 '11 at 10:17
  • 3
    scaling is referring to the requests as well. A small database with only 1 recorde on one server is not scaled when you have to expect that a thousand people request that record every second. You are hiding behind buzz-words and you made it your mission to just contradict on this site. Bye – Raffael Sep 19 '11 at 10:22
  • I know what I'm talking about. Computers do not run forever without downtime. – Jonas Sep 19 '11 at 10:22
  • A few thousands of people every seconds is not a problem for a database with caching. But lets talk only about 100 request per second and 100 records in the database. Availability is still the most important thing to drive your business, you don't want to close your shop at the best time of shopping. – Jonas Sep 19 '11 at 10:26