My question is: since all the Java code is converted to native code on Android, is there any use for JNI/NDK, and what is it?
Control over memory is the biggest one for me. I've seen Java do a very competent job of register allocation and instruction selection, but there's one glaring difficulty for performance, and that's the overhead associated with objects and the loss of control that implies over memory layout and access patterns. Still, you can make Java code a lot more performant than usual and gain back much of that control if you lean more heavily on plain old data types rather than objects.
For example, in C you can do things like this:
struct Vector
{
    // AoS: array-of-structures layout; x, y, z, w packed contiguously
    float xyzw[4];
};
And create an array of those and be guaranteed that the contents will be contiguous with a stride that's always sizeof(struct Vector), which in this case has no padding and is just sizeof(float)*4, or exactly 128 bits. You can also heap-allocate that array aligned to 128-bit boundaries and then use aligned moves to SIMD registers and vectorize your code with efficient intrinsics. You can do a similar thing in C++, where you can make a vector class with methods and not pay any overhead for the convenience, while still guaranteeing that an array of those will be contiguous.
However, if you try to create a Vector class in Java, you gain no such control. Each Vector object carries additional metadata for things like reflection and dynamic dispatch, which inflates its size. If you try to create an array of those, it's similar to creating an array of pointers: you pay the additional memory and indirection costs of the references on top of the per-instance metadata, and the contents of the array aren't guaranteed to be contiguous (they likely will be initially if you allocate each individual Vector sequentially all at once, but after the first GC cycle they can become fragmented in memory). That can be a performance killer through the loss of temporal and especially spatial locality, leading to many more cache misses than necessary.
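As a rough illustration of the pattern that runs into this overhead, here's a minimal sketch (the Vector and Scene classes are just hypothetical examples, not from any particular codebase):

// Each Vector instance carries an object header on top of its fields, and an
// array of them is really an array of references to separately heap-allocated
// objects.
class Vector {
    float x, y, z, w;
}

class Scene {
    // Vector[] stores references, not the vector data itself, so after a few
    // GC cycles the actual floats can end up scattered across the heap.
    Vector[] positions = new Vector[100_000];

    void init() {
        for (int i = 0; i < positions.length; i++) {
            positions[i] = new Vector(); // one small heap allocation per element
        }
    }
}

With this layout, every element is its own small heap object that the GC is free to relocate independently, which is where the fragmentation mentioned above comes from.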
That said, often there's plenty of room for Java code to go a whole lot faster. If you just work with a giant array of float, for example, instead of an array of Vector, you avoid all the overhead above and can be guaranteed that the contents of the array will remain contiguous. I actually think many people could make their Java applications a whole lot more performant without reaching for the JNI if they just worked with arrays of plain old data types, not objects, in the areas they're tempted to implement in JNI. For convenience you can make a Vectors class which stores a bunch of vectors at once (the object overhead becomes trivial if it's paid once for, say, a hundred vectors rather than per vector) and provides operations on them, but internally just represents those vectors as a big array of float.
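Here's a minimal sketch of what such a Vectors class might look like (the exact layout and method names are just illustrative assumptions, not a standard API):

// All vectors packed into one flat float array (x, y, z, w interleaved per
// vector), so the data stays contiguous and only one object header is paid
// for the whole collection.
class Vectors {
    private final float[] data; // layout: [x0, y0, z0, w0, x1, y1, z1, w1, ...]

    Vectors(int count) {
        data = new float[count * 4];
    }

    void set(int i, float x, float y, float z, float w) {
        int base = i * 4;
        data[base]     = x;
        data[base + 1] = y;
        data[base + 2] = z;
        data[base + 3] = w;
    }

    // Adds vector b into vector a in place, without allocating anything.
    void addInPlace(int a, int b) {
        int ai = a * 4, bi = b * 4;
        for (int k = 0; k < 4; k++) {
            data[ai + k] += data[bi + k];
        }
    }

    float dot(int a, int b) {
        int ai = a * 4, bi = b * 4;
        return data[ai]     * data[bi]
             + data[ai + 1] * data[bi + 1]
             + data[ai + 2] * data[bi + 2]
             + data[ai + 3] * data[bi + 3];
    }
}

Operations then work on indices into that one contiguous array, so the hot loop touches memory sequentially and never allocates per-vector objects.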
I've even seen a reasonable interactive CPU path tracer implemented in Java (no native API calls for the path tracing itself, including the BVH). It was surprisingly fast, especially for a teeny one-man amateur project, but when I peered at the source code, that's exactly what it did. It avoided objects for performance-critical hot data in favor of giant arrays of plain old data types. It avoided even using Vector and Matrix objects, instead just using arrays of floats and index ranges. Of course the implementation wasn't so pretty and looked a lot like old unstructured C code, but that was only for the low-level critical paths and data that was accessed billions of times over. The high-level part of the application was still modeled with objects. Their implementation details, however, avoided them.
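To give a flavor of that style, here's a small illustrative sketch of the general technique (not code from that project; the Rays class and its fields are assumptions for the example):

// Hot-path data kept as flat float arrays and addressed by ray index,
// with a stride of 3 floats per ray component set.
class Rays {
    final float[] origin;    // [ox0, oy0, oz0, ox1, oy1, oz1, ...]
    final float[] direction; // [dx0, dy0, dz0, dx1, dy1, dz1, ...]

    Rays(int count) {
        origin = new float[count * 3];
        direction = new float[count * 3];
    }

    // Point along ray r at parameter t, written into out[0..2] rather than
    // returned as a freshly allocated Vector object.
    void pointAt(int r, float t, float[] out) {
        int i = r * 3;
        out[0] = origin[i]     + t * direction[i];
        out[1] = origin[i + 1] + t * direction[i + 1];
        out[2] = origin[i + 2] + t * direction[i + 2];
    }
}

Keeping separate flat arrays like this means the hot loops read contiguous floats and write into reused output buffers, at some cost to readability, which matches what that code looked like.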