Is it safe to rely on static analysis to "reproduce" concurrency bugs reliably?

Question

I've inherited some Java code which I suspect harbours some concurrency bugs when synchronizing between a thread that queries data and an IO event that updates the same data. I'm trialling a static analysis tool called ThreadSafe which indeed detects various concurrency issues (i.e. field accessed from asynchronously invoked method without synchronization and Inconsistent synchronization of accesses to a collection).

After discovering how difficult it is to reproduce data races reliably by writing unit tests, I wondered if it's advisable to depend on ThreadSafe to "reproduce" the bugs reliably? What I'm asking is, is it safe to depend on the static analysis tool to tell me when I've fixed the bugs? (Of course, to fix the bugs I will need to understand each one in context).

So, in other words, your question is "does static analysis have false negatives". — , Jun 30 '14 at 12:04
@delnan: EXTREMELY cogent, perceptive comment. +100 if I could. — John R. Strohm, Jun 30 '14 at 12:38

GraniteRobert · Answer 1 · 2014-06-30T14:11:25.450

In my experience, a determined, well-meaning, but clueless developer can wrap enough obfuscation around a concurrency bug that the analysis tool will miss it.

If the tool thinks there's a bug, assume it's right until you prove otherwise. That's the value of the tool. If the tool doesn't find any bugs, your best bet is to pretend the tool doesn't exist.

If you want reliable concurrent code (in any language), your goals should include:

The critical sections should be tiny and simple so they are trivial to understand.
The critical sections should be completely isolated from other code at the source level.
A function/method/subroutine involved in concurrent code should do one thing only. If it manipulates a synchronization primitive (lock, semaphore, etc.) it should NOT fiddle with any of the shared resources and it should NOT contain loops or conditional code. Move the "real work" to a separate function. This supports item 1.
If you have any kind of lock, a developer should be able to read the code and determine exactly what resource(s) that lock protects in 30 seconds or less. If it's not obvious to you, it won't be obvious to the next guy. And it probably wasn't clear to the fellow who wrote the code. So it's probably broken.
The visibility of the resource you're protecting should be no greater than the visibility of the synchronization primitive that protects it. Neither component should be visible to the larger body of code.
Etc. The rest of the list is really just a variety of ways of restating items 1 and 2.

If your code is designed and organized right, you should be able to look at a tiny fraction of the code to understand and trust the concurrency. When you reach that goal, then feed the code to the static analysis tool and see if it agrees.

And give the code to a few other non-guru developers for review. Ask them to explain how the concurrency works and why it's correct. If they can't explain it, it's still too complex.

something to add to 4: each resource protected by a lock must be documented that it is protected by the lock and must never be used away from where the lock is acquired — ratchet freak, Jun 30 '14 at 13:28
Something to add: 6) NEVER acquire and hold more than one lock at any given instant. 7) If you absolutely, positively MUST acquire and hold more than one lock at any given instant, make damned certain that they are ALWAYS acquired in the SAME order. If process A and process B must both acquire locks Red and Green, they can deadlock if one of the acquires Red first and the other acquires Green first. — John R. Strohm, Jun 30 '14 at 15:43

score 3 · Accepted Answer · answered Jun 30 '14 at 12:47

Answering your first question:

http://www.contemplateltd.com/threadsafe/faq#non-checkers

"What kinds of concurrency defects does ThreadSafe not look for?

We do not yet include support for all of java.util.concurrent. Specifically, ThreadSafe does not check for misuse of the advanced synchronization facilities provided by java.util.concurrent, for example the java.util.concurrent.atomic package. Nor does it currently include any checkers for mistakes that may occur when writing parallel programs using the Fork-Join framework.
Finding bugs in intricate lock-free algorithms requires analysis technology that does not scale well. Such algorithms should be left to specialized concurrency experts. ThreadSafe can be configured to find defects in applications that use libraries containing such algorithms.
Which analyses we include in new releases of ThreadSafe is dictated both by what is possible using analysis technology that is mature enough for integration into a usable product, and by the features and concurrency defects that we see often in ‘average’ Java programs.
If you have a Java concurrency problem that is not currently covered by ThreadSafe, Contemplate’s Java concurrency specialists may be able to help by adding custom configuration to ThreadSafe and/or by use of advanced analysis technology that is not yet mature enough for integration into ThreadSafe."

Note to self: RTFM! This answers my question. – doughgle Jul 01 '14 at 01:32 — doughgle, Jul 01 '14 at 01:32

score 2 · Answer 3 · answered Jul 06 '14 at 12:51

Producing a useful static analysis tool involves balancing a number of conflicting concerns, including at least the following:

False positive (false alarm) rate
False negative (uncaught bug) rate
Run time and scalability

False positives are a critical concern with bug detection tools since they waste developer time. It's possible to eliminate false negatives, but for many types of bugs, including concurrency bugs, this would involve an unacceptable increase in the false positive rate. The rate of both false positives and false negatives can be decreased at the cost of increased run time, but the most accurate analysis techniques don't scale beyond small applications, and anyway they can't both be decreased to zero.

Dynamic analysis tools often claim a false positive rate of 0%, but that's because they only find bugs once they actually happen. For a race condition or deadlock that only happens once in a blue moon, that's not so useful.

For these reasons, ThreadSafe doesn't promise to find all concurrency bugs - it aims to find the most important ones, with a low false positive rate. Some users have tried ThreadSafe on code with a known concurrency bug that they spent days to find, and discover that it finds that bug - and often other genuine bugs that they didn't know about - with no false positives, within minutes.

A good place to start for information about ThreadSafe is this InfoQ article. For more, see the ThreadSafe website where you can sign up for a free trial.

(Disclosure: ThreadSafe is a commercial tool, and I'm co-founder of Contemplate, the company that produces it.)

Is it safe to rely on static analysis to "reproduce" concurrency bugs reliably?

3 Answers3