86

In every place I've looked, it says that double is superior to float in almost every way. float has been made obsolete by double in Java, so why is it still used?

I program a lot with Libgdx, and they force you to use float (deltaTime, etc.), but it seems to me that double is just easier to work with in terms of storage and memory.

I also read When do you use float and when do you use double, but if float is really only good for numbers with a lot of digits after the decimal point, then why can't we just use one of the many variations of double?

Is there any reason why people insist on using floats even though they don't really have any advantages anymore? Is it just too much work to change it all?

Eames
  • 885
  • 1
  • 6
  • 9
  • 8
    Possible duplicate of [When do you use float and when do you use double](http://programmers.stackexchange.com/questions/188721/when-do-you-use-float-and-when-do-you-use-double) – Vincent Savard Apr 26 '16 at 16:10
  • 1
    This should help... http://stackoverflow.com/questions/27598078/float-and-double-datatype-in-java – Jon Raynor Apr 26 '16 at 16:32
  • 58
    How in the world did you infer "float is really only good for numbers with a lot of digits after the decimal point" from the answers to that question?! They say the *direct opposite*! – Ordous Apr 26 '16 at 16:48
  • 1
    @Ordous " float should only be used if you need to operate on a lot of floating-point numbers (think in the order of thousands or more)" – Eames Apr 26 '16 at 17:09
  • 20
    @Eames Note how it says "numbers", not "digits". Floats are *worse* when you need precision or range, they are *better* when you need lots and lots of not-so-precise data. That's what those answers say. – Ordous Apr 26 '16 at 17:22
  • 1
    I note you are comparing `floats` to `Doubles`. Is that deliberate? As a `double` isn't the same as a `Double` – Richard Tingle Apr 26 '16 at 20:41
  • @RichardTingle Sorry, I actually meant double as in the primitive data type. But there are other classes, like BigDecimal which don't have primitive data types. That's probably what caused me to use Double – Eames Apr 26 '16 at 21:02
  • 29
    Why do we have `byte` and `short` and `int` when there's `long`? – user253751 Apr 27 '16 at 06:55
  • 32-bit floats require less memory and are faster, especially when using SIMD instructions. – CodesInChaos Apr 27 '16 at 13:59
  • 2
    "Why are floats kept" ...so they don't *float* away :D :D – milleniumbug Apr 27 '16 at 14:19
  • I mean, if for no other reason, old data types should be kept for the sake of compatibility with old code (not to mention with old data). I mean, you could manually convert that float representation into double yourself, should you come across one, but it's easier if it's just built in. – Alex Apr 27 '16 at 15:30
  • 4
    The way this question is worded suggests `float` is somehow older and `double` a "replacement", which is not how history happened. –  Apr 27 '16 at 16:34
  • 15
    A much more fitting question is "why would you remove a keyword and primitive datatype from a language with decades of code that would just break for no reason"? – sara Apr 27 '16 at 18:02
  • Imagine the 3D feature in google maps (I worked on something extremely similar). Millions of vertices, each with 5 values (XYZ for position and UV for texture). It is not a difficult optimization to make float work perfectly fine for rendering these models. With that in mind, why would you ever want to download (literally) twice as much data as is necessary to get it working? The difference in size between float and double may seem meaningless, but it adds up quick. – riwalk Apr 29 '16 at 17:13
  • As an anecdote - another HNQ http://electronics.stackexchange.com/questions/231705/stm32f4-floating-point-instructions-too-slow has the answer that double was used instead of float by mistake. Granted, it's not Java but C/C++, but still: on some hardware you do want to use float, because double will have to be emulated. – Andrew Savinykh May 02 '16 at 03:12

6 Answers

170

LibGDX is a framework mostly used for game development.

In game development you usually have to do a whole lot of number crunching in real-time and any performance you can get matters. That's why game developers usually use float whenever float precision is good enough.

The size of the FPU registers in the CPU is not the only thing you need to consider in this case. In fact most of the heavy number crunching in game development is done by the GPU, and GPUs are usually optimized for floats, not doubles.

And then there is also:

  • memory bus bandwidth (how fast you can shovel data between RAM, CPU and GPU)
  • CPU cache (which makes the previous one less of a bottleneck)
  • RAM
  • VRAM

which are all precious resources, and each of them holds twice as many values when you use 32-bit float instead of 64-bit double.
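
To make the memory point concrete, here is a minimal Java sketch; the 10-million-element count is made up for illustration, and the sizes follow from float being 4 bytes per element and double being 8:

```java
// Minimal sketch: the same 10 million values stored both ways.
// The double array occupies twice the heap, and moving it around
// costs twice the memory bus bandwidth and cache space.
float[] positionsF = new float[10_000_000];   // ~40 MB of data
double[] positionsD = new double[10_000_000]; // ~80 MB of data
```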

Philipp
  • 23,166
  • 6
  • 61
  • 67
  • 2
    Thank you! This really helped because you went in depth on what the memory usage changed and why – Eames Apr 26 '16 at 21:06
  • 7
    Also, for SIMD operations, 32-bit values can have twice the throughput. As [8bittree's answer](http://programmers.stackexchange.com/a/316854/87195) points out, GPUs have an even greater performance penalty with double precision. –  Apr 27 '16 at 02:20
  • 5
    Many graphic pipeline even support 16-bit half-floats to increase performance where precision is sufficient. – Adi Shavit Apr 27 '16 at 09:48
  • "In game development you _usually_ have to do a whole lot of number crunching" (emphasis mine) -> I think this is not the case. Most games are not number-crunchy. – phresnel Apr 27 '16 at 09:56
  • 22
    @phresnel All are. You have to move positions, update data and what not. And this is the _simple_ part. Then you have to render (= read, rotate, scale and translate) the textures, distances, get it to the screens format ... There's a lot to do. – Sebb Apr 27 '16 at 13:05
  • 8
    @phresnel as a former VP Operations of a game development enterprise, I assure you almost every game there is a ton of number crunching. Note it's usually contained in libraries and 100% abstracted away from the engineer, I would hope they understand and respect that all that crunching is going on. Magic inverse square root, anyone? – corsiKa Apr 27 '16 at 16:41
  • 1
    Just in case, like me, you weren't aware of the "magic" inverse square root algorithm @corsiKa mentioned, there's a nice discussion of it here: https://en.wikipedia.org/wiki/Fast_inverse_square_root – Concrete Gannet Apr 28 '16 at 08:00
  • 1
    Just looking at hardware will tell you that graphics tend to do a lot of calculations. Their memory is faster than the computer's system RAM and the GPUs are now basically a specialized CPU for floating point calculation. – Nelson Apr 28 '16 at 08:42
  • 1
    @phresnel There is a reason why the hardware that's best at number-crunching has "graphics" in its name. And graphics are major part of game development. – svick Apr 28 '16 at 16:09
  • @Sebb: In mercantile thinking mode, when you need to reduce cost in order to release on time, I wouldn't invoke parallel algorithm and datastructures, hand-coded macro- and micro-optimizations to _move_ a single entity or two, iff not *required*. Maybe you don't consider programs like Tetris or Mahjongg as _games_, then you may be correct. You know, standard software has to move, insert, remove, create, destroy things all the time, too. And if for every piece of code you fully invoke the whole CPU/GPU/etc., the end-user will thank you with bad reviews because of low battery run-times. – phresnel May 03 '16 at 05:30
  • 1
    @corsiKa: As someone who just looks a few seconds on Google's Play Service, I assure you that most games there are not number crunching heavy. Likewise in Apple's store. As someone who understands architectures, algorithms and data structures, knows how to write a compiler and optimization passes, experienced in computer graphics (OpenGL, DirectX, Software, Realistic Ray Tracing) and some more stuff, I can assure you that most *games*, including casual games, don't _need_ microoptimized code. I know how to create libraries as you describe, but there are not only graphics or AI heavy games. – phresnel May 03 '16 at 05:37
  • 1
    @corsiKa: ... Don't get me wrong: I do not in any way doubt that _your_ company needed such optimizations. But it's for example not the kind of games I see on my family's and friend's mobile phones, it's not even on mine. As you see, I strictly include casual games in my definition of "game". – phresnel May 03 '16 at 05:40
  • @svick: Likewise, processing in the CPU is what most software does, yet most software is not _heavy_ on the CPU, neither in integer nor in floating point processing. And graphics does in no way imply computing heaviness; Dropbox for example has graphics, too. – phresnel May 03 '16 at 05:42
  • 1
    @phresnel My company didn't *need* any kind of optimization. If you are experienced in computer graphics as you say you are, then you know that all that microoptimized code exists inside the OpenGL and DirectX libraries already. The number crunching occurs all under the hood, but it occurs nevertheless. You specifically mention mobile where it actually IS needed to save on battery power, even if the developer doesn't ever deal with it. It's abstracted into very low level libraries and never gets touched by devs, but there is a LOT of number crunching even for simple games. – corsiKa May 03 '16 at 19:19
  • 1
  • @phresnel You don't do this, but your OS and libraries do. I'm thinking of Tetris etc, but even Tetris has textures. Dropbox uses windows forms/gtk, both of which will invoke graphics libraries, especially for aero. When you just have any external game texturing or rendering library I _assure_ you there's gonna be a lot of number-crunching going on under the hood. And this is _good_ because the GPU is built for number crunching; the CPU is really bad and inefficient if compared. Anything above a standard terminal today will not boot without a significant GPU and there's a reason for that. – Sebb May 03 '16 at 19:21
  • @corsiKa: There's number crunching, and then there's number crunching. Let me adjust my comment to the point of the post: Performance. Tetris does not do heavy number crunching. And there is no _need_ to use 32 bit floating point in the application layer in such game (of course there's no inherent need for floating point in Tetris). You can convert to 32 bit floats in the moment you invoke APIs. There won't be a noticable performance difference if the game layer utilizes 64 bit floating point math. As I understand the question, the concern is about explitly using floats, not implicitly. – phresnel May 06 '16 at 07:31
  • @Sebb: If I think of old-style Tetris, it's just bitmaps, not textures, but indeed: gone are the days of good old blitters. I don't deny there _is_ number crunching in the invisible API layer, but for my definition of "most games", most games don't inherently need 32 bit floating points as opposed to 64 bit in the _application layer_ (which the question is about, I think). There would not be a perceivable difference in performance. But because they don't need 64 bit either, 32 bit is just enough to prevent premature pessimization. – phresnel May 06 '16 at 07:40
  • Can you please take this discussion to chat? I get an inbox message whenever you post and it is really boring. – Philipp May 06 '16 at 07:46
  • @Phresnel if we're talking about the application layer, you may be right. But you still need collision detection, check if a row is full etc.. But this should really be moved to chat. – Sebb May 06 '16 at 11:15
  • @Phillip: I am afraid this is the way Stack Overflow works. Our discussion was not off-topic w.r.t. to your answer. If you are unhappy with those inbox signals, go to meta and propose a change or ask a moderator to move it to chat. – phresnel May 09 '16 at 13:17
58

Floats use half as much memory as doubles.

They may have less precision than doubles, but many applications don't require precision. They have a larger range than any similarly-sized fixed point format. Therefore, they fill a niche that needs wide ranges of numbers but does not need high precision, and where memory usage is important. I've used them for large neural network systems in the past, for example.

Moving outside of Java, they're also widely used in 3D graphics, because many GPUs use them as their primary format - outside of very expensive NVIDIA Tesla / AMD FirePro devices, double-precision floating point is very slow on GPUs.
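
That range-versus-precision trade-off is easy to demonstrate; here is a minimal Java sketch (the 2^24 example marks the point where float can no longer represent consecutive integers):

```java
public class FloatNiche {
    public static void main(String[] args) {
        // Wide range: roughly 1.4e-45 up to 3.4e38.
        System.out.println(Float.MIN_VALUE); // 1.4E-45
        System.out.println(Float.MAX_VALUE); // 3.4028235E38

        // Limited precision: about 7 significant decimal digits.
        float f = 16_777_216f;           // 2^24
        System.out.println(f + 1f == f); // true: the +1 is rounded away
    }
}
```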

Pabru
  • 113
  • 2
Jules
  • 17,614
  • 2
  • 33
  • 63
  • 8
    Speaking of neural networks, CUDA currently has support for half-precision (16-bit) floating point variables, even less precise but with even lower memory footprints, due to the increased usage of accelerators for machine learning work. – JAB Apr 26 '16 at 22:08
  • And when you program FPGAs you tend to select the amount of bits for both mantissa and exponent manually every time :v – Sebi Apr 28 '16 at 13:04
48

Backwards Compatibility

This is the number one reason for keeping behavior in an already existing language/library/ISA/etc.

Consider what would happen if they took floats out of Java. Libgdx (and thousands of other libraries and programs) wouldn't work. Getting everything updated would take a lot of effort, quite possibly years for many projects (just look at the backwards-compatibility-breaking Python 2 to Python 3 transition). And not everything would be updated; some things would stay broken forever, whether because their maintainers abandoned them, because updating would take more effort than anyone is willing to spend, or because it's no longer possible to accomplish what the software was supposed to do.

Performance

64-bit doubles take twice the memory and are almost always slower to process than 32-bit floats (the very rare exceptions being hardware where 32-bit float capability was expected to be used so rarely, or not at all, that no effort was made to optimize for it; unless you're developing for specialized hardware, you won't run into this in the near future).
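
A rough way to see the memory-side effect on a CPU is to sum two large arrays, one of each type. This is only a sketch: the array size is arbitrary, and a serious measurement would use a benchmark harness such as JMH to deal with JIT warm-up.

```java
import java.util.Arrays;

public class SumSketch {
    public static void main(String[] args) {
        int n = 50_000_000; // arbitrary; large enough not to fit in cache
        float[] f = new float[n];
        double[] d = new double[n];
        Arrays.fill(f, 1.0f);
        Arrays.fill(d, 1.0);

        long t0 = System.nanoTime();
        float fSum = 0f;
        for (int i = 0; i < n; i++) fSum += f[i];

        long t1 = System.nanoTime();
        double dSum = 0.0;
        for (int i = 0; i < n; i++) dSum += d[i];
        long t2 = System.nanoTime();

        // The double loop pulls twice as many bytes through the memory
        // bus and caches. (As a bonus precision lesson, fSum tops out at
        // 2^24 = 16777216: past that, adding 1.0f changes nothing.)
        System.out.printf("float:  %d ms, sum=%.0f%n", (t1 - t0) / 1_000_000, fSum);
        System.out.printf("double: %d ms, sum=%.0f%n", (t2 - t1) / 1_000_000, dSum);
    }
}
```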

Especially relevant to you, Libgdx is a game library. Games have a tendency to be more performance sensitive than most software. And gaming graphics cards (i.e. AMD Radeon and NVIDIA Geforce, not FirePro or Quadro) tend to have very weak 64-bit floating point performance. Courtesy of Anandtech, here's how double-precision performance compares to single-precision performance on some of AMD's and NVIDIA's top gaming cards (as of early 2016):

AMD
Card              R9 Fury X      R9 Fury       R9 290X    R9 290
FP64 (vs. FP32)   1/16           1/16          1/8        1/8

NVIDIA
Card              GTX Titan X    GTX 980 Ti    GTX 980    GTX 780 Ti
FP64 (vs. FP32)   1/32           1/32          1/32       1/24

Note that the R9 Fury and GTX 900 series are newer than the R9 200 and GTX 700 series, so relative performance for 64 bit floating point is decreasing. Go back far enough and you'll find the GTX 580, which had a 1/8 ratio like the R9 200 series.

1/32 of the performance is a pretty big penalty to pay if you have a tight time constraint and don't gain much by using the larger double.

8bittree
  • 5,637
  • 3
  • 27
  • 37
  • 1
    note that the performance for 64-bit floating point is decreasing relative to the 32-bit performance due to increasingly-highly optimized 32-bit instructions, not because the actual 64-bit performance is decreasing. it also depends on the actual benchmark used; I wonder if the 32-bit performance deficit highlighted in these benchmarks is due to memory bandwidth issues as well as actual computational speed – sig_seg_v Apr 26 '16 at 22:35
  • If you're going to talk about DP performance in graphics cards you should definitely mention the Titan/Titan Black. Both feature mods that allow the card to reach 1/3 performance, at the cost of single precision performance. – SGR Apr 27 '16 at 08:32
  • @sig_seg_v There are definitely at least some cases where the 64-bit performance decreases absolutely, not just relatively. See [these results](http://www.anandtech.com/bench/GPU16/1516) for a double precision Folding@Home benchmark, where a GTX 780 Ti beats both a GTX 1080 (another 1/32 ratio card) and a 980 Ti, and on AMD's side, the 7970 (a 1/4 ratio card), as well as the R9 290 and R9 290X all beat the R9 Fury series. Compare that to the [single precision version of the benchmark](http://www.anandtech.com/bench/GPU16/1515), where the newer cards all handily outperform their predecessors. – 8bittree Jul 26 '16 at 00:12
37

Atomic operations

In addition to what others have already said, a Java-specific disadvantage of double (and long) is that assignments to 64-bit primitive types are not guaranteed to be atomic. From the Java Language Specification, Java SE 8 Edition, page 660 (emphasis added):

17.7 Non-atomic Treatment of double and long

For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.

Yuck.

To avoid this, you have to declare the 64-bit variable with the volatile keyword, or use some other form of synchronization around assignments.
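
For instance, a minimal sketch of both remedies (class and field names are made up):

```java
class SharedReadings {
    // A write to this field may be observed by another thread as two
    // separate 32-bit halves (JLS 17.7): a "torn" value.
    double plainValue;

    // Reads and writes of volatile long and double are always atomic.
    volatile double volatileValue;

    // Alternative: guard a plain field with synchronization.
    private double guardedValue;
    synchronized void set(double v) { guardedValue = v; }
    synchronized double get() { return guardedValue; }
}
```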

Kevin J. Chase
  • 479
  • 3
  • 7
  • 2
    Don't you need to synchronize concurrent access to ints and floats anyways to prevent lost updates and make them volatile to prevent overeager caching? Am I wrong in thinking that the only thing the int/float atomicity prevents is that they can never contain "mixed" values they weren't supposed to hold? – ASA Apr 28 '16 at 11:33
  • 3
    @Traubenfuchs That is, indeed what is guaranteed there. The term I have heard used for it is "tearing," and I think it captures the effect quite nicely. The Java programming language model guarantees that 32 bit values, when read, will have a value which was written to them at some point. That is a surprisingly valuable guarantee. – Cort Ammon Apr 28 '16 at 23:54
  • This point about atomicity is super-important. Wow, I'd forgotten about this important fact. Counter-intuitive as we may tend to think of primitives as being atomic by nature. But not atomic in this case. – Basil Bourque Apr 30 '16 at 03:14
3

It seems other answers missed one important point: SIMD architectures can process twice as many float values as double values per instruction (for example, eight floats at a time versus four doubles).

Performance considerations summary

  • float may be faster on certain CPUs (for example, certain mobile devices).
  • float uses less memory so in huge data sets it may substantially reduce the total required memory (hard disk / RAM) and consumed bandwidth.
  • float may cause a CPU to consume less power for single-precision computations than for double-precision ones (I cannot find a reference, but it seems plausible at least).
  • float consumes less bandwidth, and in some applications that matters.
  • SIMD architectures may process as much as twice the amount of data per instruction, since twice as many float values fit into each vector register.
  • float uses half as much cache memory as double for the same number of values.

Accuracy considerations summary

  • In many applications float is precise enough.
  • double has much more precision (roughly 15–16 significant decimal digits versus about 7 for float).

Compatibility considerations

  • If your data has to be submitted to a GPU (for example, for a video game using OpenGL or any other rendering API), float is considerably faster than double. GPU manufacturers try to maximize the number of graphics cores, so they save as much circuitry as possible in each core; optimizing for float allows them to build GPUs with more cores inside.
  • Old GPUs and some mobile devices just cannot accept double as the internal format (for 3D rendering operations)

General tips

  • On modern desktop processors (and probably a good number of mobile processors) you can basically assume that using temporary double variables on the stack gives extra precision for free (extra precision without a performance penalty).
  • Never use more precision than you need (you may not know how much precision you really need).
  • Sometimes you are just forced by the range of values: a computation may overflow to infinity in float while remaining a finite value in double (see the sketch after this list).
  • Using only float or only double greatly helps the compiler to SIMD-ify the instructions.
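
As a minimal sketch of the range point above (3.0e38 is an arbitrary value chosen to sit just below Float.MAX_VALUE):

```java
// Doubling a value just below float's upper limit overflows to
// Infinity, while the same computation stays finite in double.
float f = 3.0e38f;
System.out.println(f * 2f); // Infinity (Float.MAX_VALUE is ~3.4e38)

double d = 3.0e38;
System.out.println(d * 2);  // 6.0E38 (Double.MAX_VALUE is ~1.8e308)
```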

See comments below from PeterCordes for more insights.

CoffeDeveloper
  • 551
  • 1
  • 4
  • 10
  • 1
    `double` temporaries is only free on x86 with the x87 FPU, not with SSE2. Auto-vectorizing a loop with `double` temporaries means unpacking `float` to `double`, which takes an extra instruction, and you process half as many elements per vector. Without auto-vectorization, the conversion can usually happen on the fly during a load or store, but it means extra instructions when you're mixing floats and doubles in expressions. – Peter Cordes Apr 27 '16 at 19:17
  • 1
    On modern x86 CPUs, div and sqrt are faster for float than double, but other things are the same speed (not counting the SIMD vector width issue, or memory bandwidth / cache footprint of course). – Peter Cordes Apr 27 '16 at 19:18
  • @PeterCordes thanks for expanding some points. I was not aware of the div and sqrt disparity – CoffeDeveloper Apr 28 '16 at 08:03
0

Apart from the other reasons which were mentioned:

If you have measurement data, be it pressures, flows, currents, voltages or whatever, it is often acquired with hardware containing an ADC (analog-to-digital converter).

An ADC typically has 10 or 12 bits; 14- or 16-bit ones are rarer. Let's stick with the 16-bit one: measuring near full scale, you have a resolution of 1/65535. That means a change from 65534/65535 to 65535/65535 is just that one step, 1/65535, which is roughly 1.5E-05. The relative precision of a float is around 1E-07, so a lot finer. That means you don't lose anything by using float to store these data.

If you do extensive calculations with floats, you fare slightly worse than with doubles in terms of accuracy, but often you don't need that accuracy, as you often don't care whether you just measured a voltage of 2 V or 2.00002 V. Similarly, if you convert this voltage into a pressure, you don't care whether you have 3 bar or 3.00003 bar.
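
To make that concrete, here is a sketch of a hypothetical 16-bit ADC reading converted to a voltage (the raw sample and the 5 V reference are made-up values):

```java
// The ADC resolves 1 part in 65535 (~1.5e-5 near full scale), while
// float resolves ~1e-7, so the conversion loses no measurement data.
int raw = 43210;              // made-up raw ADC sample
float fullScale = 5.0f;       // assumed reference voltage in volts
float volts = raw * fullScale / 65535.0f;
System.out.println(volts);    // ~3.2967 V
```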

glglgl
  • 242
  • 1
  • 7