
I see lots of discussions flying around about the performance of database 'x', or claims that moving from 'x' to 'y' improved our site performance.

I have yet to see proper benchmarking that works across different types of databases.

  1. Is it possible to write a meaningful benchmark that could be used across multiple database types, such as relational, document-oriented, etc.?

  2. How would you go about designing such a benchmark?

Dan McGrath
  • As an example of the level of detail I'd require to take any database benchmark seriously, take a look at this [paper](http://research.yahoo.com/node/3202) by Yahoo Research. I don't really have a good answer for you, other than that I also suspect CAP compromises and asymmetries are the main reason benchmarking databases is so damn hard. – yannis Dec 31 '11 at 15:23

4 Answers


Short answer

Yes, you can write a meaningful benchmark of a studied case, if you do it with care and understand that while it is relevant for that particular case, it may not be for others. This is equally true when comparing databases of the same type (one relational database vs. another) and when comparing databases of different types.

No, you cannot write a benchmark which will magically prove that a specific database is vastly better than another in every case, for every application.

Long answer

It's definitely possible to say that "moving from one database to another improved our site performance". Here's how:

  1. You measure the performance with the previous database, through profiling or runtime statistics, gathering enough information about the queries and how fast they run.

  2. You move the application to the new database.

  3. You take the same measurements.

  4. You compare.

For example, if the full list of 3,182,432 products loaded in 2.834 s on the old database and loads in 0.920 s on the new one, given that in both cases the application starts with an empty cache, it's a win: the new database improved your site's performance for this query.
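A minimal sketch of what steps 1 and 3 can look like in Python (the `load_all_products` callables are hypothetical stand-ins for the query under test; the essential point is that both databases are measured with exactly the same procedure):

```python
import statistics
import time

def time_query(run_query, repeats=10):
    """Return the median wall-clock time of a zero-argument callable.

    The median is less sensitive than the mean to one-off outliers
    (GC pauses, network hiccups). Whatever you choose here (number of
    repeats, cold vs. warm cache), apply it identically to both sides.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Hypothetical callables wrapping the same query on each engine:
# old_s = time_query(lambda: load_all_products(old_connection))
# new_s = time_query(lambda: load_all_products(new_connection))
# print(f"old: {old_s:.3f} s, new: {new_s:.3f} s")
```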

Now, like any performance metric, this one is biased:

  • Agreed, the new query is faster. But wait: your DBA didn't know how to use the database you had before, so the query which loads all the products was never optimized. Rewrite it properly and you can load those products in 0.855 s instead of 2.834 s, on the old database.

  • OK, you have a better result. But don't you think it's unfair to compare a database with freshly loaded data to a ten-year-old database whose last maintenance plan ran three years ago? By the way, shouldn't you have upgraded the database product at least once during the last four years?

  • Some queries are faster; some are slower. How do you compute an overall result to know whether you gained performance by moving to the new database? OK, loading all 3,182,432 products is faster. But does that matter, when the query runs only in the rare case where an administrator performs a specific task he has done twice in the last ten years? On the other hand, executing all the queries on the home page for a fresh user takes 0.281 s with the new database versus 0.207 s with the old one. That result matters much more, since those queries cannot be cached for long and are executed tens of thousands of times per day (a frequency-weighted sketch follows this list).

  • Both databases must be tested on the same servers: same hardware, same structure. For example, you can't test one database on a single hard drive and the other on a RAID 1 array of two SSDs. In practice, when you migrate a large project to a new database, chances are you'll host the new database on a hundred newly deployed rack servers while the previous database remains on the old machines.
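One way to turn "some queries are faster, some are slower" into an overall answer is to weight each query by how often it actually runs. A minimal sketch, using the timings from the examples above and made-up execution counts:

```python
# (old_seconds, new_seconds, executions_per_day); all figures
# are illustrative, borrowed from the examples in this answer.
queries = {
    "load_all_products": (2.834, 0.920, 0.001),   # rare admin task
    "home_page_queries": (0.207, 0.281, 40_000),  # hot path
}

def daily_seconds(index):
    """Total time per day spent in the listed queries."""
    return sum(t[index] * t[2] for t in queries.values())

old_total, new_total = daily_seconds(0), daily_seconds(1)
print(f"old: {old_total:,.0f} s/day, new: {new_total:,.0f} s/day")
# The 3x faster product dump is irrelevant; the slightly slower
# home page costs roughly 2,960 extra seconds per day.
```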

To summarize: you can benchmark the database queries of an application and obtain precise metrics. But then you have to give those numbers a meaning. At this point, it's tempting to say that you improved site performance: otherwise, management would be angry to learn that you spent thousands of dollars and months of work just to make things slower.

The worst mistake is to take those benchmark results and draw some stupid conclusion like "Microsoft SQL Server is three times faster than Oracle". Saying this is like saying "Java is better than PHP". Define better. Better in which cases? For what sort of applications? For what team of developers?

The more you interpret and generalize, the more irrelevant and meaningless the claim becomes.

> The query select [...] found in revision #832, file ProductFactory.cs, line 117, executes in under 0.5 s with the new database when tested under the conditions specified in the non-functional requirements annex M, case 3. This allows passing non-functional requirement 527 (see page 80, revision 9). The same requirement was not satisfied with the previous database, where the test results were in the range 0.9–1.3 s under the same conditions.

is meaningful for a developer, and precise enough to know what was tested, how, and what the results were. This answers your question number 2.

Sadly, it doesn't make any sense to management. Instead:

> Migrating our product from MySQL to the newest version of Microsoft SQL Server improved the overall performance of our product fivefold, while cutting costs in half and the environmental footprint by a factor of three. We believe that migrating all our applications to Microsoft SQL Server next year will give even better results and increase our market competitiveness.

is pure marketing jibber-jabber which, technically, doesn't mean anything, but surprisingly has value for the management and marketing departments.

Finally, can we compare different types of databases? I would say it's entirely possible. Let's say I have a website hosting large photos. Those photos are stored in varbinary(max) columns in Microsoft SQL Server 2005 (so I can't use FILESTREAM). I am concerned about the performance of loading those photos, so I decide to store them as files instead, using the file system as my new database. At first, those files are stored on the same machine as the database. I profile the new solution and obtain results showing that, in my case, files load 4% faster from the file system than from Microsoft SQL Server. The benchmark is very clear. Now I can think about deploying a dedicated server optimized for direct file storage, rather than using a server optimized for Microsoft SQL Server.
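A sketch of how that comparison might be profiled in Python (pyodbc is just one way to reach SQL Server; the connection string, the Photos table with its Id and Content columns, and the file layout are all hypothetical):

```python
import statistics
import time

import pyodbc  # assumes an ODBC driver for SQL Server is installed

# Hypothetical connection string and schema.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=dbhost;DATABASE=PhotoSite;Trusted_Connection=yes"
)

def load_from_db(photo_id):
    # Photo bytes stored in a varbinary(max) column.
    cursor = conn.cursor()
    cursor.execute("SELECT Content FROM Photos WHERE Id = ?", photo_id)
    return cursor.fetchone()[0]

def load_from_fs(photo_id):
    # The same photo stored as a plain file.
    with open(f"/srv/photos/{photo_id}.jpg", "rb") as f:
        return f.read()

def median_seconds(loader, photo_ids, repeats=5):
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for pid in photo_ids:
            loader(pid)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# ids = [...]  # a representative sample of photo ids
# print(median_seconds(load_from_db, ids), median_seconds(load_from_fs, ids))
```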

Arseni Mourzenko
  1. With all the money at stake for the major database companies, and the large community of developers working on open source databases, if there were a way to do it, someone would have figured it out by now (and blasted the results all over the internet).

  2. I wouldn't. Instead, create specific benchmarks for specific needs and environments.

At some point, the amount of money available and the designer's expertise with a particular database may determine the limitations more than anything else. A good Oracle DBA will outperform most junior developers regardless of which platform they choose.

JeffO

No, the differences between them are such that any one benchmark would be biased.

That said, developing a site like the Computer Language Benchmarks Game, which includes a broad range of tests and makes it easy to compare them (either specific tests language-to-language, or composites of many languages), would be of some benefit (at least in my eyes), especially if it were set up so the community could submit solutions and improve any shortcomings in schemas or queries.

In the case of the DB benchmark site, instead of implementing algorithms (as in the case of the language shootout), tests could consist of raw data that has to be stored and then retrieved according to specific constraints. For instance, there might be a set of raw data representing a simple schema like the one a community library would use to track patrons and books. Each DB must store all 1 million records and then retrieve the sub-sets of the data that meet the constraints. There could also be a data set representing a very simple structure/relationship (say, a comment system typically used for sites like ESPN) that contains 100 million records, with its own set of queries that must be performed. And so on.
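To illustrate what one such test case might look like, here is a toy version of the library data set (SQLite is only a convenient stand-in here; each engine under test would load the same generated records into its own storage model and run the same constrained retrievals):

```python
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patrons (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books   (id INTEGER PRIMARY KEY, title TEXT,
                          borrowed_by INTEGER REFERENCES patrons(id));
""")

random.seed(42)  # deterministic, so every engine sees the same data
conn.executemany("INSERT INTO patrons VALUES (?, ?)",
                 ((i, f"patron-{i}") for i in range(100_000)))
conn.executemany("INSERT INTO books VALUES (?, ?, ?)",
                 ((i, f"book-{i}", random.randrange(100_000))
                  for i in range(1_000_000)))
conn.commit()

# One fixed retrieval constraint: patrons currently holding
# more than twenty books.
start = time.perf_counter()
rows = conn.execute("""
    SELECT borrowed_by, COUNT(*) FROM books
    GROUP BY borrowed_by HAVING COUNT(*) > 20
""").fetchall()
print(len(rows), "patrons,", round(time.perf_counter() - start, 3), "s")
```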

Testing DBs on a broad range of data sets (from complex to simple relationships, from small sets to humongous ones) could prove very helpful, as you'd at least be able to see general tendencies for data that bears similar qualities to the project you're currently assessing.

AdamJonR

I would like to add a few more reasons why you can't benchmark all types of databases with a single benchmark.

  1. There are two major directions of database systems: OLAP and OLTP (see comparison).

  2. As you said, there are also relational and document-oriented database systems. While relational DBMSs strictly follow the ACID principles, most document-oriented DBMSs let you decide that weaker consistency is sufficient for your application. That makes locking and scheduling much easier.

In short: you wouldn't argue that a Lamborghini is the best car in the world. Think of trunk volume, number of seats, or mileage.

As a side note: here is a benchmark for OLTP database systems.

Matthias