Are there ways to unit test your multi-threaded code for race conditions and deadlocks?
To see if they are performing the way they should be...
CHESS, a project of Microsoft Research. Quoting their site:
CHESS is a tool for finding and reproducing Heisenbugs in concurrent programs. CHESS repeatedly runs a concurrent test ensuring that every run takes a different interleaving. If an interleaving results in an error, CHESS can reproduce the interleaving for improved debugging. CHESS is available for both managed and native programs.
Update (9/23/2015): For C, C++, and Go, you can use ThreadSanitizer.
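As a quick illustration (my own toy example), here is a C++ program with a deliberate data race of the kind ThreadSanitizer flags; compile with -fsanitize=thread on GCC or Clang:

    #include <iostream>
    #include <thread>

    int counter = 0;  // shared and unsynchronized -- a deliberate data race

    void increment()
    {
        for (int i = 0; i < 100000; ++i)
            ++counter;  // unsynchronized read-modify-write from two threads
    }

    int main()
    {
        std::thread t1(increment);
        std::thread t2(increment);
        t1.join();
        t2.join();
        std::cout << counter << '\n';  // varies from run to run
    }

Build with something like g++ -fsanitize=thread -g race.cpp && ./a.out and TSan prints a data-race report with the two conflicting stacks. (For Go, the equivalent is the built-in -race flag on go test.)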
Valgrind has Helgrind, which really helps. Not only does it point out races that could lead to starvation or deadlock, but the slight slowdown from running under the tool sometimes exposes races that would not be seen otherwise.
So even if you go commando with some kind of lock-free method, it still helps :)
It is POSIX-centric, however. It also ships with headers that make it easy for simple unit-test libraries like TAP to detect that they are running under it, which is really helpful. For instance, you could have a thread that normally would not block when trying to acquire a lock go ahead and block (perhaps randomly), just to simulate starvation.
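As an illustration (again my own toy example, not from the Helgrind docs), here is the classic lock-order inversion that Helgrind complains about even on runs where the program happens not to deadlock:

    #include <mutex>
    #include <thread>

    std::mutex a, b;

    void thread1()
    {
        std::lock_guard<std::mutex> la(a);
        std::lock_guard<std::mutex> lb(b);  // locks a, then b
    }

    void thread2()
    {
        std::lock_guard<std::mutex> lb(b);
        std::lock_guard<std::mutex> la(a);  // locks b, then a -- potential deadlock
    }

    int main()
    {
        std::thread t1(thread1), t2(thread2);
        t1.join();
        t2.join();
    }

Run it under valgrind --tool=helgrind ./a.out and Helgrind reports the inconsistent lock ordering.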
I don't remember the details exactly, and I've only done it once, but the general idea was to separate the re-entrant code from the code performing the task, using an interface so that I could mock the task class.
Then I designed the mock to block inside a call so that I knew the thread was in the critical section, called the guarded code again from a second thread and verified that it was waiting, and then released the first thread and let everything finish cleanly.
Something like that.
I'm not sure that would work for more complex scenarios, but it helps preserve the behaviour during refactorings.
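A rough C++ sketch of the idea (all names here are mine, invented for illustration): the mock task parks inside the critical section until the test releases it, which lets the test observe that a second caller is held at the lock.

    #include <atomic>
    #include <cassert>
    #include <chrono>
    #include <condition_variable>
    #include <mutex>
    #include <thread>

    // The interface that separates the locking shell from the real work.
    struct Task { virtual void run() = 0; virtual ~Task() = default; };

    // The re-entrant wrapper under test: serializes calls to the task.
    struct Guarded
    {
        std::mutex m;
        Task& task;
        explicit Guarded(Task& t) : task(t) {}
        void invoke() { std::lock_guard<std::mutex> lock(m); task.run(); }
    };

    // Mock task that blocks inside the critical section until released.
    struct BlockingMock : Task
    {
        std::mutex m;
        std::condition_variable cv;
        bool inside = false, entered = false, release = false;
        void run() override
        {
            std::unique_lock<std::mutex> lock(m);
            assert(!inside);  // fails if the wrapper lets two callers in at once
            inside = true;
            entered = true;
            cv.notify_all();
            cv.wait(lock, [this] { return release; });
            inside = false;
        }
    };

    int main()
    {
        BlockingMock mock;
        Guarded guarded(mock);
        std::atomic<bool> second_done{false};

        std::thread first([&] { guarded.invoke(); });

        // Wait until the first thread is inside the critical section.
        {
            std::unique_lock<std::mutex> lock(mock.m);
            mock.cv.wait(lock, [&] { return mock.entered; });
        }

        std::thread second([&] { guarded.invoke(); second_done = true; });

        // The second thread should still be blocked on the mutex.
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        assert(!second_done);

        // Release the first thread; both calls should now finish cleanly.
        { std::lock_guard<std::mutex> lock(mock.m); mock.release = true; }
        mock.cv.notify_all();
        first.join();
        second.join();
        assert(second_done);
    }

The sleep-then-assert step is only a heuristic, of course, but the assert(!inside) in the mock catches any overlapping entry into the critical section directly.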
At JAOO/GOTO this year I saw a presentation by John Hughes on his model-based testing tool.
The trick is that you model what your hairball application is supposed to DO in terms of invocation steps, alongside the actual operations on your application. The software then systematically tries many permutations of invocation steps, repeatedly and in parallel, and afterwards checks that the state of the application corresponds to the state of the model. If an error is found, it knows how to shrink the steps to the minimal case that reproduces the error.
He demonstrated live how to catch several bugs in core Erlang libraries that had been lurking for 15 years and were occasionally reported, but nobody could figure out where they came from and hence how to fix them. With the minimal cases reported by the software, the library maintainer was able to fix each bug within a day.
It was SO impressive.
John Hughes sells this software through his company, Quviq.
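The commercial tool is Erlang-based, but the kernel of the idea (run operations in parallel, then compare the resulting state against what a sequential model predicts) can be sketched in a few lines. This toy C++ version is mine, with none of the clever interleaving control or test-case shrinking; it checks a racy counter against the model "final value equals total increments":

    #include <iostream>
    #include <thread>
    #include <vector>

    int main()
    {
        const int threads = 4, ops = 100000;
        for (int trial = 0; trial < 100; ++trial)
        {
            int counter = 0;  // the "real" system: a racy, unguarded counter
            std::vector<std::thread> pool;
            for (int t = 0; t < threads; ++t)
                pool.emplace_back([&] { for (int i = 0; i < ops; ++i) ++counter; });
            for (auto& th : pool) th.join();

            const int model = threads * ops;  // what a sequential model predicts
            if (counter != model)
            {
                std::cout << "trial " << trial << ": got " << counter
                          << ", model says " << model << '\n';
                return 1;
            }
        }
        std::cout << "no discrepancy observed (which does not prove absence of bugs)\n";
    }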
You may try my Relacy Race Detector. It's designed to carefully and precisely verify synchronization algorithms like producer-consumer queues and concurrent containers, but it's not very suitable for verifying whole programs. However, perhaps it's a good idea not to spread synchronization and mutexes all over a program anyway, but instead to concentrate the synchronization in specialized components (which can then be verified with Relacy).
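For reference, a Relacy test looks roughly like this; it is adapted from memory from the example in Relacy's own documentation, so check the current docs for the exact headers and API:

    #include <relacy/relacy_std.hpp>

    // The template parameter '2' is the number of threads.
    struct race_test : rl::test_suite<race_test, 2>
    {
        std::atomic<int> a;  // instrumented by Relacy
        rl::var<int>     x;  // plain variable, checked for data races

        void before()        // runs single-threaded, before the threads start
        {
            a($) = 0;
            x($) = 0;
        }

        void thread(unsigned thread_index)  // each thread's body
        {
            if (0 == thread_index)
            {
                x($) = 1;
                a($).store(1, rl::memory_order_relaxed);  // deliberately too weak
            }
            else
            {
                if (1 == a($).load(rl::memory_order_relaxed))
                    x($) = 2;  // Relacy reports the data race on x
            }
        }
    };

    int main()
    {
        rl::simulate<race_test>();  // systematically explores interleavings
    }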
Not a strict unit test, but a runtime check that has helped me with some intermittently failing tests. It's quick & dirty, but it worked.
When a mutex is granted, I keep track of which thread holds it. All mutex requests have a thirty-second timeout, after which they scream deadlock.
I can then use the list of granted mutexes to see which thread is holding the blocking mutex, and why it has held it for so long. In every case so far it has been because that thread was deadlocked on something else, so I could then fix that situation.
This worked for me because my mutexes have a cross-platform wrapper class, which made it easy to inject the record-keeping and the timeout. I also knew enough about the application to know it should never block on a mutex for 30 seconds.
It might not be completely general-purpose, but it saved a lot of debugging for a couple of hours' programming effort. The overhead is negligible, and it can be debug-build only.
I'm going to look at extending it to record nested mutex request sequences and see if any are potentially deadlock-inducing (e.g. one thread locks A then B while another locks B then A) rather than only actually deadlock-inducing, but so far it has been a lot of benefit for trivial effort.
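A minimal sketch of such a wrapper in portable C++ (my reconstruction, not the author's cross-platform class), using std::timed_mutex; the error-path read of owner is deliberately best-effort, which is acceptable for a debug-only diagnostic:

    #include <chrono>
    #include <exception>
    #include <iostream>
    #include <mutex>
    #include <thread>

    class CheckedMutex
    {
        std::timed_mutex m;
        std::thread::id owner;  // who currently holds it, for diagnostics
    public:
        void lock()
        {
            using namespace std::chrono_literals;
            if (!m.try_lock_for(30s))  // scream instead of hanging forever
            {
                std::cerr << "possible deadlock: thread " << std::this_thread::get_id()
                          << " waited 30s for a mutex held by " << owner << '\n';
                std::terminate();
            }
            owner = std::this_thread::get_id();
        }
        void unlock()
        {
            owner = std::thread::id();  // mark as unowned
            m.unlock();
        }
    };

Because it exposes lock()/unlock(), it drops straight into std::lock_guard<CheckedMutex>.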
It's not easy, but basically the only way is to call the multi-threaded code concurrently from multiple threads and change timing and ordering randomly by playing with random Thread.sleep() and Thread.yield() calls (assuming Java).
There are also ready-made tools available (such as TestNG) that do something like what's described above, but as far as I know they're not very mature yet.
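The answer assumes Java, but the same trick works anywhere. Here is the shape of it in C++ (a sketch of mine, with a trivially correct doWork() standing in for the code under test):

    #include <cassert>
    #include <chrono>
    #include <mutex>
    #include <random>
    #include <thread>
    #include <vector>

    const int kThreads = 8;
    const int kIters   = 10000;

    std::mutex m;
    long counter = 0;

    // Stand-in for the code under test (here, a correctly guarded increment).
    void doWork()
    {
        std::lock_guard<std::mutex> lock(m);
        ++counter;
    }

    int main()
    {
        std::vector<std::thread> pool;
        for (int t = 0; t < kThreads; ++t)
            pool.emplace_back([t] {
                std::mt19937 rng(t);  // seeded per thread so schedules differ
                std::uniform_int_distribution<int> jitter(0, 3);
                for (int i = 0; i < kIters; ++i)
                {
                    // Randomly perturb timing and ordering, as with
                    // Thread.sleep()/Thread.yield() in Java.
                    if (jitter(rng) == 0)
                        std::this_thread::sleep_for(std::chrono::microseconds(jitter(rng)));
                    else
                        std::this_thread::yield();
                    doWork();
                }
            });
        for (auto& th : pool) th.join();
        assert(counter == long(kThreads) * kIters);  // the invariant under test
    }

Swap in your own doWork() and final assertions; the random sleeps and yields just shake up the scheduler so different interleavings get exercised across runs.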