17

I'm a SQL guy, but I know there is Not Only SQL databases - document-database mostly. As with most technologies there are pro and cons for each technology.

I've read some articles, but they were too theorical. What I would like is two real cases :

  1. when a switch from relational- to document-database gave an improvement
  2. when a switch from document- to relational-database gave an improvement

Improvement being any thing that makes better programs - less developpement time, scalabilty, performance, anything that is programming related. There is a caveat for 2. : stories like " falling back to relational database because everybody knows SQL" is not good

Johan Buret
  • 281
  • 1
  • 2
  • 10
  • 8
    Wrong approach. It's not about "performance" or "scalability". It's about which model fits the problem you're trying to solve. You might want to update your question to allow for the idea that perhaps the relational database isn't a good fit for numerous kinds of problems. – S.Lott Jul 08 '11 at 09:57
  • 3
    @S.Lott, the choice is often very much one of performance. consider that any relational DB can be used as a simple document DB -- only the performance would be a distinguishing characteristic. – edA-qa mort-ora-y Jul 08 '11 at 10:01
  • I have reworded my question so that it's not loaded in any way. – Johan Buret Jul 08 '11 at 10:07
  • 2
    @edA-qa mort-ora-y: "any relational DB can be used as a simple document DB". That must be false or folks wouldn't have invented an alternative. "only the performance would be a distinguishing characteristic". Only true if you assume that the relational model does everything equally well. If it did everything, there would be no alternative. Yet. We have alternatives. There are many problems (like hierarchies) which don't fit the relational model **perfectly well**, and require clever tricks. Or an alternative data model. – S.Lott Jul 08 '11 at 10:27
  • "read some articles"? Please provide some links or titles or references or quotes. We don't know what "too theorical" means to you. – S.Lott Jul 08 '11 at 10:40

3 Answers3

17

The main reason for choosing a NoSQL database the last years have been Availability. For companies like Amazon, Google and Facebook an hour of downtime or so isn't acceptable. To achieve high availability you need to reduce single-point-of-failure, that means you need to use a distributed system with multiple computers in case a computer crashes, the service is still available.

Traditional Relatione databases isn't very good in a distributed multi-master setup. That's why NoSQL has been so popular lately. So if you need high availability you may choose a NoSQL database like Riak, Cassandra, HBase, S3 or BigTable.

There is a good blog post about Amazon's Dynamo that is a good introduction to distributed NoSQL databases.

Now, the NoSQL term is very broad so there are many NoSQL databases that aren't distributed. But they solve other problems. E.g. Neo4j - a graph database are good on a type of queries that traditional RDBMS aren't optimized for. Or as in your case a document database, where you don't have to change the schema if you want to add some fields for some documents. In other words a document database is good when most posts (documents) has different fields so a relational table with predefined columns isn't usable.

However, most of the NoSQL databases are not as flexible as traditional RDBMS databases are, so it's a good choice to use a traditional RDBMS database until it can't solve your problems anymore.

Jonas
  • 14,867
  • 9
  • 69
  • 102
13

I have a simple approach to determine the database that best fits the data.

I just ask myself: Assuming I'd have no database, would I rather save the most and the important data as document or would I store them in a spreadsheet.

When the answer is "Spreadsheet", this is a clear sign that a relational model and a traditional RDBMS best suits the tasks most of the times. If the data are really simple, like only key value pairs or simple tables and referential integrity is not a topic, then a NoSQL database is probably best suited for the task and might boost performance quite a lot!

Also, when you cannot find a common structure at all, a NoSQL database is best suited for the task.

When the data are more document-like, e.g. hierarchically structured textual data without clear relations, then I immediatelly think of an XML-Database, which easily lets you store hierarchical structured documents. Sometimes it's best to use a document-management software, though.

So, in order to give a concrete and simple answer to both of your questions: It depends on the data.

when a switch from relational- to document-database gave an improvement

When you need to persist hierarchically structured textual data, an Xml-Database can be a big improvement in terms of maintainability and probably also scalability.

when a switch from document- to relational-database gave an improvement

Well, for example when the data are mostly in table-like form with clear relations and you need to guarantee integrity.

Falcon
  • 19,248
  • 4
  • 78
  • 93
10

We had to give up on the relational model because the data we were getting had no simple, obvious, fixed, static schema.

The users -- and the user stories -- did not have a fixed, static schema.

We tried to impose a fixed, static, RDBMS schema, but it was a mistake.

Each 3rd party data delivery (from customers and from vendors) was similar, but not identical. We tried mapping it to a fixed relational schema, but the variability was too great. We either had to add fields with every file (several each week) or we had to step away from the fixed, static relational schema.

If we viewed each record as a "document" with a common subset of elements and a unique (as well as ill-defined) collection of additional data elements, we were much, much happier.

The ill-defined collection of data elements is what the users actually needed for their use cases.

The fixed, static schema of the relational model did not fit our use cases.

S.Lott
  • 45,264
  • 6
  • 90
  • 154
  • I have seen other projects fail to meet requirements because of exactly the requirements you have described. This is what document databases were meant for. – maple_shaft Jul 08 '11 at 12:32