I would imagine the reason was fast, array-like access to the character at a given index, but some characters won't fit into 16 bits, so that wouldn't work...
So if you have to handle special cases anyway, why not just use UTF-8?
Because it used to be UCS-2, which was a nice fixed-width 16-bit encoding. Of course, 16 bits turned out not to be enough, so they retrofitted UTF-16 on top of it.
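To make the retrofit concrete, here is a minimal sketch of what it looks like from the `String` API today. The class name `SurrogateDemo` and its helper methods are made up for illustration; the calls they wrap (`length`, `charAt`, `codePointCount`, `offsetByCodePoints`, `codePointAt`) are the standard `java.lang.String` methods. The sample string contains U+1F600, which lies outside the 16-bit range and is therefore stored as a surrogate pair.

```java
public class SurrogateDemo {
    // 'a', then U+1F600 written as its surrogate pair (D83D DE00), then 'b'.
    static final String TEXT = "a\uD83D\uDE00b";

    // Number of 16-bit code units, which is what length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of actual Unicode code points.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // The n-th code point. This has to walk the string, so it is not
    // the constant-time array access that charAt() gives for code units.
    static int codePointAtIndex(String s, int n) {
        return s.codePointAt(s.offsetByCodePoints(0, n));
    }

    public static void main(String[] args) {
        System.out.println(codeUnits(TEXT));                                // 4
        System.out.println(codePoints(TEXT));                               // 3
        System.out.println(Character.isHighSurrogate(TEXT.charAt(1)));      // true
        System.out.println(Integer.toHexString(codePointAtIndex(TEXT, 1))); // 1f600
    }
}
```

So `charAt(1)` hands back half a character, and getting the real second character means scanning from the start, which is the same kind of special-casing a UTF-8 string would need.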
For the most part, it was for the sake of plain and simple future-proofing. Whether that was a misguided reason and the wrong way to go about it is a different question.
You can see the reasoning behind some of their design decisions in this document about the 2004 switch to Java 5 and UTF-16, which explains some of the shortcomings as well: Supplementary Characters in the Java Platform. See also Why does the Java ecosystem use different encodings throughout their stack?.
For more details on the pitfalls of using UTF-16, and why UTF-8 is likely to be a better option in general, see Should UTF-16 be considered harmful? and the UTF-8 Everywhere manifesto.