16

In numerical computing, it is very important to be able to identify unstable schemes and to improve their stability. How can unstable floating-point computations be identified?

I am working on a very complex simulation in which many numerical schemes work together, and I am looking for a method to identify its weak parts. The simulation is a physical model involving differential equations. A bird's-eye view of the overall process is:

  1. (Preliminary step) Gather physical observations P.

  2. Determine the initial parameters of the simulation. This uses an optimisation algorithm: we walk through a parameter space and look for parameters C such that some error function E(F(C), P) is minimised, where F is some quantity derived from the parameters.

  3. Plug C into the simulation engine. This is an Euler scheme for the PDE: at each time step, we compute the terms driving the dynamics (each of them a complex function, potentially subject to instability) and feed them to the Euler scheme to compute the next state. This goes on for thousands of time steps.

  4. At the end of the simulation, we compute some function Proof(S) of the final state S and compare it to some quantities Require(P) deduced from the observed quantities. This is not a formal proof of the result, but rather a plausibility check.
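To make the discussion concrete, here is a deliberately tiny Python sketch of steps 2–4. The names F, E, Proof and Require, the one-dimensional dynamics and the grid search are all stand-ins for the real, much more complex components:

```python
import math

# Hypothetical one-dimensional stand-in for steps 2-4: the real F, E,
# Proof and Require are reduced to toy definitions.

def simulate(c, dt=1e-3, steps=5000):
    """Explicit Euler scheme for s' = -c * s, starting from s = 1."""
    s = 1.0
    for _ in range(steps):
        s += dt * (-c * s)            # dynamic term, then Euler update
    return s

def error(c, observed):
    """E(F(C), P): squared deviation of the simulated final state."""
    return (simulate(c) - observed) ** 2

# Step 2: crude grid search minimising E (a stand-in for the optimiser).
observed = math.exp(-2.0 * 5.0)       # "P": pretend the true rate is 2
best_c = min((k * 0.1 for k in range(1, 50)),
             key=lambda c: error(c, observed))

# Steps 3-4: run the engine with the fitted C, then the plausibility check.
final_state = simulate(best_c)
print(best_c, abs(final_state - observed))   # Proof(S) versus Require(P)
```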

Also, I see a tower of complex operations (computation of dynamic terms, within the Euler scheme, within the Proof) and would like to recognise the “bad parts” and fix them.

I speculate that using a software implementation of floating-point numbers with reduced precision would magnify the instability of numerical schemes, thus easing the comparison between different implementations. Is this a common technique for investigating this question? Is it possible to use a virtual machine, such as Bochs, to achieve this without altering the program?
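For instance, reduced precision can be emulated in software without a VM by rounding every intermediate result through IEEE single precision. A Python sketch, where the naive summation is only a stand-in for a real numerical scheme:

```python
import struct

# A sketch of software-reduced precision: f32() rounds a double through
# IEEE-754 single precision, magnifying round-off without any VM.

def f32(x):
    """Round a double to the nearest single-precision value and back."""
    return struct.unpack('f', struct.pack('f', x))[0]

def naive_sum(xs, rnd=lambda v: v):
    total = 0.0
    for x in xs:
        total = rnd(total + x)        # rnd() injects the reduced precision
    return total

xs = [1e-4] * 10**6                   # exact sum would be 100
double_result = naive_sum(xs)         # stays very close to 100
single_result = naive_sum(xs, f32)    # visibly drifts away from 100
print(double_result, single_result)
```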

To deal appropriately with the stability question, it is sometimes acceptable to target the typical input of the numerical procedure, so that it can be tuned to do well on that input and perhaps less well on other valid, yet unlikely, input. Given a sample of typical inputs, it is possible to snoop on some intermediate results and prepare a statistical profile of them. Again, is this a common technique for studying stability issues? Is a virtual machine useful for this?

user40989
  • maybe you will have more interesting answers on http://math.stackexchange.com/ – Simon Bergot Dec 11 '13 at 09:54
  • @Simon You might be right, but this is definitely a cross-domain question. I guess the people able to answer are either registered on both math and programmers or on neither… Let's wait a bit and see if this question finds its answer here! – user40989 Dec 11 '13 at 11:33
  • A custom numeric type that keeps track of the error in C++ should work decently. Of course that requires some changes to the program. – CodesInChaos Dec 11 '13 at 14:01
  • @CodesInChaos Do you mean having a numeric type working on both FP arithmetic and exact arithmetic, and logging huge differences? It is a neat idea, but I see several difficulties: how to identify numbers (in the code)? How to deal with transcendental functions (exp, log, ...) which cannot be exactly computed? – user40989 Dec 11 '13 at 14:11
  • I meant a custom type that keeps track of upper and lower bounds by using different rounding. This doesn't work in every situation (In particular non continuous operations like `x > y` don't work), but should work well in many cases. – CodesInChaos Dec 11 '13 at 14:15
  • 1
    Interval arithmetics? – SK-logic Dec 11 '13 at 21:12
  • 1
    P.S. https://en.wikipedia.org/wiki/Interval_arithmetic – SK-logic Dec 12 '13 at 09:50
  • @SK-logic Interval arithmetic looks interesting. Do you have examples or hints on how to use this to investigate properties of an existing floating-point program? – user40989 Dec 12 '13 at 11:50
  • Well, a constantly growing interval indicates that a numeric method is not stable (i.e., a small variation leads to accumulating an error). Detecting such conditions is what interval arithmetics was designed for. – SK-logic Dec 12 '13 at 12:07
  • What about some injection style code that could comprehend mathematical operations, and then perform a reverse calculation? Testing the reverse of a single operation should return you to the original value, and the greater the difference, the greater the precision interfered with the calculation. Does this sound plausible? Are your operations reversible? – Kieveli Dec 12 '13 at 12:48
  • What is “injection style code”? Your idea is interesting, but not that practical here because I deal with real functions of a large number of real arguments. Maybe I could use something like numerical partial derivatives, computing ULP differences between f(x1, x2, ...) and f(x1 + epsilon, x2, ...) where epsilon changes a few ULPs of x1. – user40989 Dec 12 '13 at 13:31
  • @Kieveli, you will *almost* *never* get the original value by reversing floating point operations, and it does not necessarily mean the calculation is not stable. – SK-logic Dec 12 '13 at 14:15
  • @SK-logic, stable converging iterating methods commonly result in growing intervals with interval arithmetic. – AProgrammer Dec 13 '13 at 14:16
  • Would it be possible to expand your question with info on the domain of the solution, and also describe the algorithms involved to some extent? As mentioned by others, the applicable methods depend greatly on the domain and type of algorithms. – KlausCPH Dec 20 '13 at 10:28
  • @KlausCPH I added a few more details, I hope they are useful! – user40989 Dec 20 '13 at 10:50
  • 2
    Using Euler to propagate state isn't necessarily evil; neither is optimization, but you really have to split the problem into subtasks. Numerical instability may be the least of your woes - convergence to a false maximum and problems related to the stiffness of the ODEs/PDEs loom larger than that. And yes, never ever use single precision :) – Deer Hunter Dec 20 '13 at 11:21

4 Answers

6

The study of the stability of floating-point computation is part of numerical analysis, and if you really want a sound result, you really want someone knowledgeable in that domain to do the analysis of the algorithms used.

There are some things which can help to experimentally identify unstable algorithms: running with rounding set to different modes (up/down/random) or with different precisions and checking that the results don't vary too much. Answering “is this too much?” isn't simple at all, and even when the answer is no, that doesn't mean the algorithm is stable, just that it wasn't detected as unstable on the data set you used.
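As an illustration, here is a Python sketch using the decimal module; this changes the arithmetic rather than re-running the original binary program, but it shows the idea of varying precision and rounding mode:

```python
from decimal import (Decimal, localcontext,
                     ROUND_CEILING, ROUND_FLOOR, ROUND_HALF_EVEN)

def run_sum(terms, prec, rounding):
    """Re-run a naive summation at a chosen precision and rounding mode."""
    with localcontext() as ctx:
        ctx.prec, ctx.rounding = prec, rounding
        total = Decimal(0)
        for t in terms:
            total += Decimal(t)
        return total

# Cancellation-prone data: huge terms that should cancel, leaving 2000.
terms = [10**8, 1, -(10**8), 1] * 1000
results = [run_sum(terms, prec=8, rounding=mode)
           for mode in (ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN)]
spread = max(results) - min(results)
print(results, spread)
```

A large spread between rounding modes is the warning sign; a small spread, as said above, proves nothing.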

Interval arithmetic has been proposed in the comments. When I looked at it, even the most rabid proponents of interval arithmetic admitted that it worked well with algorithms designed for interval arithmetic, but that switching to it without analysing the algorithm and ensuring that it had no patterns known not to work well wouldn't be useful. (Opponents seemed to be of the opinion that the preconditions for interval arithmetic to be useful were too restrictive to be of practical interest.)
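To show what a growing enclosure looks like, here is a toy interval type in Python; real implementations use directed rounding, and widening each bound by one ulp is a crude stand-in for it:

```python
import math

def _widen(lo, hi):
    """Fake outward rounding by widening each bound by one ulp."""
    return Interval(math.nextafter(lo, -math.inf),
                    math.nextafter(hi, math.inf))

class Interval:
    """Toy closed interval [lo, hi] with outward-rounded arithmetic."""

    def __init__(self, lo, hi=None):
        self.lo = lo
        self.hi = lo if hi is None else hi

    def __add__(self, other):
        return _widen(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return _widen(min(products), max(products))

    def width(self):
        return self.hi - self.lo

# Iterating x -> 1.1 * x: the enclosure keeps widening, which is the
# "constantly growing interval" warning sign mentioned in the comments.
widths = []
x = Interval(1.0)
for _ in range(100):
    x = x * Interval(1.1)
    widths.append(x.width())
print(widths[0], widths[-1])
```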

AProgrammer
3

Designing stable floating-point algorithms is highly nontrivial. Those more mathematically skilled than I am suggest using well-regarded libraries where possible rather than attempting to roll your own. The standard reference in the area seems to be:

N. J. Higham. Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, second edition, 2002. ISBN 0-89871-521-0

Not knowing more about the types of computations, the languages involved, etc. makes it hard to give a concrete answer. There is a good lecture at http://introcs.cs.princeton.edu/java/91float/. It may be a little basic, but it is a good introduction if you are using Java.

WPrecht
1

How to identify unstable floating point computations? Is this a common technique to investigate this question?

I think that unless you need to show some statistics on the error, you don't really need to gather samples. What you need is numerical analysis, which also falls under the subjects of numerical methods, numerical linear algebra, etc. These are part of computer science, so you might get some answers on cs.stackexchange too.

Anyhow, in general programming, most problems are easy to spot given some basic understanding of how floating point works and of basic numerical methods. Even the more complex problems are "easier" to solve today with the availability of 128-bit floats, which is even less reason to produce error samples. Here are some example problems to illustrate my point:

  1. using floating point to calculate monetary values;
  2. using floating point for very big numbers;
  3. not doing divisions before other operations when it is possible to do so (to keep values closer to 0);
  4. long calculations without special treatment for error propagation.
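Problem 1, for instance, is easy to demonstrate (Python here purely for illustration):

```python
from decimal import Decimal

# Problem 1 in action: ten payments of 0.10 do not sum to exactly 1.00
# in binary floating point, because 0.1 has no finite binary expansion.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)                    # False

exact = sum(Decimal("0.10") for _ in range(10))
print(exact == Decimal("1.00"))        # True
```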

There is also an example of a naive algorithm and an error-compensated algorithm in algorithms for calculating variance. Looking at the naive version, you can just sense that doing the calculation in a loop will accumulate errors that are never compensated for.
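A condensed Python version of the two algorithms from that article; the large offset in the data is the classic way to expose the naive formula:

```python
# Naive textbook formula versus Welford's compensated online algorithm.

def naive_variance(xs):
    n = len(xs)
    s = sum(xs)
    sq = sum(x * x for x in xs)
    return (sq - s * s / n) / n        # subtracts two huge, nearly equal numbers

def welford_variance(xs):
    mean, m2 = 0.0, 0.0
    for i, x in enumerate(xs, 1):
        delta = x - mean
        mean += delta / i
        m2 += delta * (x - mean)       # compensated update, no big cancellation
    return m2 / len(xs)

data = [1e9 + x for x in (4.0, 7.0, 13.0, 16.0)]   # true variance is 22.5
print(naive_variance(data), welford_variance(data))
```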

imel96
  • Thank you for your answer; I am however looking for more detailed information. I have a very large computation and want to identify its weak parts. I edited the question accordingly. – user40989 Dec 12 '13 at 09:54
  • I'm not exactly sure what your situation is when you say you have a large computation and want to identify its weak parts. Numeric calculations inherently have errors in them, even a single add operation. So, unless your large computation is error compensated, it needs to be fixed as a whole; improving the weaker spots might not be good enough. If you know the "epsilon" of your floating-point model, a simple analysis will show how big the error can become as it propagates through the long computation. – imel96 Dec 14 '13 at 22:33
0

You can avoid numerical errors by using appropriate data types (like, for example, continued fractions). If you need or want to use floating-point arithmetic, you need to apply numerical know-how to understand the errors.
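As an illustration of the principle (using rationals from Python's fractions module rather than continued fractions):

```python
from fractions import Fraction

# The map x -> 4*x - 1 has 1/3 as a fixed point, so the exact orbit of 1/3
# never moves; in floating point, the initial rounding error of 1/3 is
# multiplied by 4 at every step and quickly dominates.
x_float = 1.0 / 3.0
x_exact = Fraction(1, 3)
for _ in range(30):
    x_float = 4 * x_float - 1
    x_exact = 4 * x_exact - 1

print(x_float)                 # far from 1/3 after only 30 steps
print(x_exact)                 # still exactly 1/3
```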

Ingo
  • I do not want to avoid numerical errors; I want to find which parts of the computation are unstable. It is similar to localising speed bottlenecks when optimising for speed: I want to optimise precision and therefore want to find the precision bottlenecks. (Continued fractions are not useful here.) – user40989 Dec 12 '13 at 11:32
  • 1
    @user40989, then you definitely need interval arithmetics. – SK-logic Dec 12 '13 at 11:44