228

In my programming experience, I frequently need to decide whether I should use float or double for my real numbers. Sometimes I go for float, sometimes for double, but really this feels subjective. If I were confronted and asked to defend my decision, I would probably not give sound reasons.

When do you use float and when do you use double? Do you always use double, switching to float only when memory constraints are present? Or do you always use float unless the precision requirement forces you to use double? Are there substantial differences in the computational complexity of basic arithmetic between float and double? What are the pros and cons of using float or double? And have you ever used long double?

Milan
Jakub Zaverka
  • In many cases you want to use neither, but rather a decimal floating-point or fixed-point type. Binary floating-point types can't represent most decimals exactly. – CodesInChaos Feb 28 '13 at 11:20
  • Related to [What causes floating point rounding errors?](http://programmers.stackexchange.com/q/101163/22493). @CodesInChaos [my answer](http://programmers.stackexchange.com/a/101197/22493) there suggests resources to help you make that determination; there is no *one-size-fits-all* solution. – Mark Booth Feb 28 '13 at 13:26
  • Very good answer found at: [Stack Overflow](http://stackoverflow.com/questions/407970/when-to-use-a-float) – Haris Feb 28 '13 at 13:36
  • For decimals I would use neither. I would use an integer and store the value multiplied by 100. – Martin York Feb 28 '13 at 15:35
  • What exactly do you mean by "decimals"? If you need to represent values like 0.01 exactly (say, for money), then (binary) floating-point is not the answer. If you merely mean non-integer numbers, then floating-point is likely OK -- but then "decimals" is not the best word to describe what you need. – Keith Thompson Feb 28 '13 at 16:22
  • @Keith I mean just the case when one needs to store a floating-point number. It doesn't necessarily have to be decimal - it can also be sound or image data, for example. – Jakub Zaverka Feb 28 '13 at 18:51
  • @JakubZaverka: I've edited your question to refer to "real numbers" rather than "decimals". – Keith Thompson Feb 28 '13 at 18:54
  • Considering (as of today) most graphics cards accept floats over doubles, graphics programming often uses single precision. – Thomas Eding Aug 19 '14 at 17:01
  • You don't always have a choice. For example, on the Arduino platform, both double and float equate to float. You need to find an add-in library to handle real doubles. – kiwiron Apr 29 '16 at 05:22

8 Answers

216

The default choice for a floating-point type should be double. This is also the type that you get with floating-point literals without a suffix and (in C) with the standard functions that operate on floating-point numbers (e.g. exp, sin, etc.).

float should only be used if you need to operate on a lot of floating-point numbers (think in the order of thousands or more) and analysis of the algorithm has shown that the reduced range and accuracy don't pose a problem.

long double can be used if you need more range or accuracy than double, and if it provides this on your target platform.

In summary, float and long double should be reserved for use by specialists, with double for "every-day" use.
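
A minimal sketch of the difference in practice (assuming an IEEE 754 platform, where float carries about 7 significant decimal digits and double about 15-16):

```c
#include <stdio.h>

int main(void) {
    float  f = 0.1f; /* single precision: ~7 significant decimal digits */
    double d = 0.1;  /* double precision: ~15-16 significant digits */

    printf("float : %.17g\n", f); /* 0.10000000149011612 */
    printf("double: %.17g\n", d); /* 0.10000000000000001 */

    /* Note: an unsuffixed literal such as 0.1 is a double; assigning
       it to a float silently narrows it to single precision. */
    float g = 0.1;
    printf("g     : %.17g\n", g); /* same value as f */
    return 0;
}
```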

Bart van Ingen Schenau
  • I would probably not consider float for a few thousand values unless there were a performance problem related to floating point caching and data transfer. There is usually a substantial cost to doing the analysis to show that float is precise enough. – Patricia Shanahan Feb 28 '13 at 15:35
  • As an addendum, if you need compatibility with other systems, it can be advantageous to use the same data types. – zzzzBov Feb 28 '13 at 16:30
  • I'd use floats for millions of numbers, not 1000s. Also, some GPUs do better with floats; in that specialized case use floats. Else, as you say, use doubles. – user949300 Aug 19 '14 at 16:57
  • @PatriciaShanahan - 'performance problem related to..' A good example is if you are planning to use SSE2 or similar vector instructions: you can do 4 ops/vector in float (vs 2 per double), which can give a significant speed improvement (half as many ops and half as much data to read & write). This can significantly lower the threshold where using floats becomes attractive, and worth the trouble to sort out the numeric issues. – greggo Sep 09 '14 at 19:03
  • I endorse this answer with one additional piece of advice: when one is operating with RGB values for display, it is acceptable to use `float` (and occasionally half-precision) because neither the human eye, the display, nor the color system has that many bits of precision. This advice is applicable for, say, OpenGL etc. It does not apply to medical images, which have stricter precision requirements. – rwong Nov 17 '14 at 22:00
  • It should be noted that ops on long double are often extremely slow, as much as 5x, because they generally have to be implemented in software – raptortech97 Feb 07 '15 at 13:20
  • IMHO it all depends on the application and the accepted target accuracy. For example: if one is working on a 3D graphics rendering engine that operates on millions or billions of vertices in a 3D Cartesian graph, then it is more efficient to use `float`, accepting some loss of precision for a performance gain. On the other hand, if precision is of higher importance, such as in an application that works with subatomic particles or astronomy-type situations, and performance is not as high a priority, then by all means use a `double`. – Francis Cugler Dec 31 '17 at 07:33
  • ... Also, the effects of caching should play a large role in the decision to choose one type over another. If the order of magnitude of floating-point arithmetic operations is low, then double should be fine for memory consumption, caching, and speed; once the magnitude of operations exceeds a limit that depends on the architecture, OS, and language/compiler, then float should be considered. There is always a tradeoff between one or the other. – Francis Cugler Dec 31 '17 at 07:42
  • @user949300 which GPUs do better with `float`s? Also, by any chance do you know why? Thank you in advance! – Milan Nov 19 '21 at 20:35
  • @rwong Images have pixel locations in integer format and so do the pixel values, right? Higher-resolution images have more pixels, but their locations would still be integers. In the same way, pixel values can have an 8-bit resolution, 16-bit, or more, but those are still integer values, correct? Please do correct me if I'm mistaken here. Thank you in advance! – Milan Nov 19 '21 at 20:41
  • @Milan shader languages such as OpenGL GLSL often use float types for color representation and coordinate systems, e.g. for vertex and geometry shaders. Many GPUs provide hardware support for that. – Hulk Nov 23 '21 at 16:40
  • I'd just like to add that another reason to use floats is when you are working with a library that uses floats extensively, the big example here being OpenCV. OpenCV defines things like its convex hull routine only for float points, even where the float/double distinction does not affect the algorithm. – jwezorek Aug 27 '22 at 22:56
48

There is rarely cause to use float instead of double in code targeting modern computers. The extra precision reduces (but does not eliminate) the chance of rounding errors or other imprecision causing problems.

The main reasons I can think of to use float are:

  1. You are storing large arrays of numbers and need to reduce your program's memory consumption.
  2. You are targeting a system that doesn't natively support double-precision floating point. Until recently, many graphics cards only supported single-precision floating point. I'm sure there are plenty of low-power and embedded processors that have limited floating-point support too.
  3. You are targeting hardware where single-precision is faster than double-precision, and your application makes heavy use of floating point arithmetic. On modern Intel CPUs I believe all floating point calculations are done in double precision, so you don't gain anything here.
  4. You are doing low-level optimization, for example using special CPU instructions (SIMD) that operate on multiple numbers at a time; see the sketch below.

So, basically, double is the way to go unless you have hardware limitations or unless analysis has shown that storing double precision numbers is contributing significantly to memory usage.
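
As a concrete illustration of point 4, here is a minimal sketch using x86 SSE intrinsics (the two helper function names are mine, not from the answer): a 128-bit register holds four floats but only two doubles, so the float version issues half as many multiplies and touches half as much memory.

```c
#include <immintrin.h> /* SSE/SSE2 intrinsics */
#include <stddef.h>

/* Scale an array of floats in place: 4 elements per instruction. */
void scale_floats(float *a, size_t n, float s) {
    __m128 vs = _mm_set1_ps(s);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(&a[i], _mm_mul_ps(_mm_loadu_ps(&a[i]), vs));
    for (; i < n; i++) /* scalar tail */
        a[i] *= s;
}

/* Same operation on doubles: only 2 elements per instruction. */
void scale_doubles(double *a, size_t n, double s) {
    __m128d vs = _mm_set1_pd(s);
    size_t i = 0;
    for (; i + 2 <= n; i += 2)
        _mm_storeu_pd(&a[i], _mm_mul_pd(_mm_loadu_pd(&a[i]), vs));
    for (; i < n; i++) /* scalar tail */
        a[i] *= s;
}
```

The same 2:1 ratio holds for wider registers (AVX: 8 floats vs 4 doubles).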

Tim Armstrong
  • "Modern computers" meaning Intel x86 processors. Some of the machines the Ancients used provided perfectly adequate precision with the basic float type. (The CDC 6600 used a 60-bit word, 48 bits of normalized floating-point mantissa, 12 bits of exponent. That's ALMOST what the x86 gives you for double precision.) – John R. Strohm Aug 19 '14 at 17:03
  • @John.R.Strohm: agreed, but C compilers did not exist on CDC6600. It was Fortran IV... – Basile Starynkevitch Aug 19 '14 at 20:41
  • By "modern computers" I mean any processor built in the last decade or two, or really, since the IEEE floating point standard was widely implemented. I'm perfectly aware that non-x86 architectures exist and had that in mind with my answer - I mentioned GPUs and embedded processors, which are typically not x86. – Tim Armstrong Jan 28 '15 at 21:43
  • That's simply not true, though. SSE2 can manipulate 4 floats or 2 doubles in one operation, AVX can manipulate 8 floats or 4 doubles, AVX-512 can manipulate 16 floats or 8 doubles. For any kind of high performance computing, math on floats should be thought of as twice the speed of the same operations on doubles on x86. – Larry Gritz Sep 20 '16 at 18:19
  • And it's even worse than that, since you can fit twice as many floats in processor cache as you can with doubles, and memory latency is likely to be the main bottleneck in many programs. Keeping a whole working set of floats warm in cache may be literally an order of magnitude faster than using doubles and having them spill to RAM. – Larry Gritz Sep 20 '16 at 18:20
12

Use double for all your calculations and temp variables. Use float when you need to maintain an array of numbers (float[], if the precision is sufficient) and you are dealing with tens of thousands of float numbers or more.

Many/most math functions or operators convert/return double, and you don't want to cast the numbers back to float for any intermediate steps.

E.g., if you have an input of 100,000 numbers from a file or a stream and need to sort them, put the numbers in a float[].
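
A sketch of that pattern in C, assuming a hypothetical read_sample() input function: do the intermediate arithmetic in double, but store the bulk data as float to halve its memory footprint.

```c
#include <stdlib.h>

double read_sample(void); /* hypothetical input source */

/* Load n samples: full double precision while computing,
   narrowed to float only when stored in the big array. */
float *load_samples(size_t n) {
    float *buf = malloc(n * sizeof *buf);
    if (buf == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        double x = read_sample(); /* intermediate math in double */
        buf[i] = (float)x;        /* narrow only at storage time */
    }
    return buf;
}
```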

Trang Oul
Fai Ng
5

Some platforms (ARM Cortex-M2, Cortex-M4, etc.) don't support double in hardware (check the reference manual for your processor; the absence of compilation warnings or errors does not mean the code is optimal, because double can be emulated in software). That is why you may need to stick to int or float.

If that is not the case, I would use double.
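
On such targets, a sketch of the kind of trap to watch for (assuming a single-precision FPU such as the Cortex-M4F): unsuffixed literals and the double-returning math functions silently drag in software double emulation.

```c
#include <math.h>

float slow_scale(float x) {
    /* 0.5 and sin() are double: this goes through emulated
       double arithmetic on a float-only FPU. */
    return x * 0.5 + sin(x);
}

float fast_scale(float x) {
    /* 0.5f and sinf() keep everything in hardware single precision. */
    return x * 0.5f + sinf(x);
}
```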

You can check the famous article by D. Goldberg ("What Every Computer Scientist Should Know About Floating-Point Arithmetic"). You should think twice before using floating-point arithmetic; there is a pretty big chance it is not needed at all in your particular situation.

http://perso.ens-lyon.fr/jean-michel.muller/goldberg.pdf

staroselskii
  • This question was already pretty well answered a year ago... but in any case, I'd say any time you're using double on platforms with double-precision FPU acceleration, you should be using it on any other, even if that means letting the compiler emulate it instead of taking advantage of an FPU with floating-point only (note that FPUs aren't required on all platforms either; in fact a Cortex-M4 architecture defines them as an optional feature [was M2 a typo?]). – Selali Adobor Sep 22 '14 at 23:23
  • The key to that logic is, while it's true one should be wary of floating-point arithmetic and its many "quirks", one should definitely not take the presence of FPU support for doubles to mean one should simply use doubles instead of floats. Floats are very generally faster than doubles and take less memory (FPU features vary). The volume of usage precludes this point from being premature optimization. As does the fact that doubles are clearly overkill for a lot (maybe even most) of applications. Do the elements on this page really need to have their relative positions and sizes calculated to _13_ decimal places? – Selali Adobor Sep 22 '14 at 23:36
  • When including a link to an off site page or document, please copy the relevant information, or summary, from the document into your answer. Off site links have a tendency to disappear over time. – Adam Zuckerman Sep 23 '14 at 00:10
3

For real-world problems, the sampling threshold of your data is important when answering this question. Similarly, the noise floor is also important. If your chosen data type's precision already exceeds both, no benefit will come from increasing precision further.

Most real-world samplers are limited to 24-bit DACs. That suggests 32-bit floats, whose significand carries 24 bits of precision, should be adequate for real-world calculations.
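
A quick sketch of that 24-bit limit (assuming IEEE 754 single precision): a float represents integers exactly only up to 2^24, which is exactly the range a 24-bit sample occupies.

```c
#include <stdio.h>

int main(void) {
    float a = 16777216.0f; /* 2^24: exactly representable */
    float b = a + 1.0f;    /* 2^24 + 1: rounds back to 2^24 */
    printf("%.1f\n%.1f\n", a, b); /* prints the same value twice */
    return 0;
}
```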

Double precision comes at the cost of 2x the memory. Therefore, limiting the use of doubles in favor of floats could drastically cut the memory footprint/bandwidth of running applications.

2

A very simple rule: you use double unless you, personally, can give reasons that you can defend for why you would use float.

Consequently, if you ask “should I use double or float”, the answer is “use double”.

gnasher729
-3

The choice between float and double depends on the accuracy required of the data. If an answer must differ only negligibly from the actual answer, many decimal places will be required, which dictates using double. Float will chop off some of the decimal places, reducing the accuracy.
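
For instance (a minimal illustration, assuming IEEE 754 types), a constant known to 16 digits survives in a double but is rounded after about 7 digits in a float:

```c
#include <stdio.h>

int main(void) {
    double d = 3.141592653589793;  /* kept to ~16 digits */
    float  f = 3.141592653589793f; /* rounded to ~7 digits */
    printf("%.15f\n", d);          /* 3.141592653589793 */
    printf("%.15f\n", (double)f);  /* 3.141592741012573 */
    return 0;
}
```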

-5

Usually, I use the float type when I don't need much precision — for example, for money — which is wrong, but it is what I am wrongly used to doing.

On the other hand, I use double when I need more precision, for example for complex mathematical algorithms.

The C99 standard says this:

There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.
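
A small sketch that makes those guarantees visible on a given platform, using the C99 <float.h> constants (the values in the comments are typical for IEEE 754 on x86):

```c
#include <float.h>
#include <stdio.h>

int main(void) {
    /* Decimal digits each type can round-trip without loss. */
    printf("float:       %d digits\n", FLT_DIG);  /* typically 6  */
    printf("double:      %d digits\n", DBL_DIG);  /* typically 15 */
    printf("long double: %d digits\n", LDBL_DIG); /* 18 on x86    */
    return 0;
}
```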

I have never really used long double, but I don't use C/C++ much. Usually I use dynamically typed languages like Python, where you don't have to care about the types.

For further information about Double vs Float, see this question at SO.

  • Using floating point for serious money calculations is probably a mistake. – Bart van Ingen Schenau Feb 28 '13 at 10:53
  • @BartvanIngenSchenau, gosh, I thought so, but that's what I've been always doing. – Addison Montgomery Feb 28 '13 at 10:54
  • float is exactly the wrong type for money. You need to be using the highest precision possible. – ChrisF Feb 28 '13 at 10:56
  • @BartvanIngenSchenau Floating point for money is usually okay, *binary* floating point is not. For example .net's `Decimal` is a floating point type and it's typically a good choice for money calculations. – CodesInChaos Feb 28 '13 at 11:21
  • @CodesInChaos Where did you find the statement that the .Net `Decimal` type is floating point? – GalacticCowboy Feb 28 '13 at 16:35
  • `Decimal` represents numbers of the form `m*10^(-e)` where `e` is between 0 and 28. ["The binary representation of a Decimal number consists of a 1-bit sign, a 96-bit integer number, and a scaling factor used to divide the integer number and specify what portion of it is a decimal fraction. The scaling factor is implicitly the number 10 raised to an exponent ranging from 0 to 28."](http://msdn.microsoft.com/en-us/library/bb1c1a6x.aspx) – CodesInChaos Feb 28 '13 at 16:39
  • However, based on those facts, one can conclude that it's not floating point *in the same sense as we normally use those words* - it's fixed point, except that the scaling factor is a member of the type instead of being defined *by* the type. – GalacticCowboy Feb 28 '13 at 17:27
  • @ChrisF You don't need "high precision" for money, you need exact values. – Sean McSomething Feb 28 '13 at 19:37
  • @SeanMcSomething - Fair point. However, floats are still the wrong type, and given the floating-point types available in most languages you need "high precision" to get "exact values". – ChrisF Mar 01 '13 at 08:38
  • It really depends what the program is for. If the program is used to track personal finance, make simple predictions over time, demonstrate compound interest or the like, single-precision floats are unlikely to cause problems. One could also imagine making a personal ledger using only integers (one for dollars, one for cents). I'm not saying float is the *best* way to represent money, but it's totally fine for certain instances. – jvriesem Feb 18 '20 at 18:42
  • @jvriesem At that point, just skip the separate dollars integer and store your dollars as part of the cents. It is somewhere between "unlikely" and "impossible" that you'll deal with amounts where the choice to store dollars as hundreds of cents instead of as dollars makes the difference that overflows your largest available integer type. And if it does, time to embrace arbitrary precision math libraries that manage numbers represented as arrays of integers or whatever. – mtraceur Feb 24 '21 at 07:28
  • I think I disagree with @GalacticCowboy's last comment: that description of the decimal type's representation sounds exactly like the base-10 equivalent of how normal floating point types are internally represented. – mtraceur Feb 24 '21 at 07:31