52

When teaching recently about the Big vs. Little Endian battle, a student asked whether it had been settled, and I realized I didn't know. Looking at the Wikipedia article, it seems that the most popular current OS/architecture pairs use Little Endian but that Internet Protocol specifies Big Endian for transferring numeric values in packet headers. Would that be a good summary of the current status? Do current network cards or CPUs provide hardware support for switching byte order?

Ellen Spertus
  • 728
  • 1
  • 6
  • 13

4 Answers

33

I'd argue that it's not so much won as ceased to matter. ARM, which makes up basically all of the mobile market, is bi-endian (oh, the heresy!). In the sense that x86 basically "won" the desktop market, I suppose you could say that little endian won, but I think, given the overall code depth (shallow) and abstraction (lots) of many of today's applications, it's much less of an issue than it used to be. I don't recall endianness really coming up in my Computer Architecture class.

I suspect that many developers aren't even aware of endianness or why it's important, because for the vast (and I mean vast) majority it's utterly irrelevant to their daily working environment. This was different 30 years ago, when everyone was coding much closer to the metal, as opposed to manipulating text files on a screen in fancy and dramatic ways.

My general suspicion is that Object Oriented Programming was the beginning of the end of caring about endianness since the layers of access and abstraction in a good OO system hide implementation details from the user. Since implementation includes endianness, people got used to it not being an explicit factor.

Addendum: zxcdw mentioned portability being a concern. However, what has arisen with a vengeance in the last 20 years? Programming languages built on virtual machines. Sure, the virtual machine's endianness might matter, but it can be made very consistent for that one language, to the point where it's basically a non-issue. Only the VM implementors would even have to worry about endianness from a portability standpoint.

  • 3
    There are still many very relevant domains in which it matters, for example when writing *any form* of portable code. In fact, where it probably does not matter is when writing *non-portable* code which is tied to a platform. – zxcdw Sep 23 '12 at 18:48
  • @zxcdw which leads us directly to the army of virtual machine languages out there... I hadn't thought of that. –  Sep 23 '12 at 19:53
  • 2
    Your addendum is not entirely true (and neither do I agree with @zxcdw): endianness matters only when translating between multibyte integers and byte streams, and becomes a problem when it's done implicitly and varies between platforms. Most modern languages (whether VM-based or not) achieve portability by having you do it rarely (with integers as an opaque datatype), and then have endianness either specified independently of the platform, or explicitly chosen by the programmer. – Michael Borgwardt Sep 23 '12 at 20:06
  • @MichaelBorgwardt The said translation between multibyte integers may happen when simply running the software on another platform. This is a major concern when working with native code when the platform isn't fixed. Of course, considering that it's 2012 and more and more people are moving to higher levels of abstraction, this issue can be considered to be fading away and insignificant for the huge, huge majority of programmers. – zxcdw Sep 23 '12 at 20:20
  • @zxcdw: are there actually platforms that support the same machine language but differ in endianness? – Michael Borgwardt Sep 23 '12 at 20:31
  • 2
    @MichaelBorgwardt ARM does http://www.arium.com/pdf/Endianness.pdf –  Sep 23 '12 at 20:35
  • @MichaelBorgwardt Why would the machine language matter? If you write software for say x86 which is little endian and port that to say Motorola 68000 which is big endian, then your multi-byte integers as data will get mixed. Not to mention bi-endian architectures. – zxcdw Sep 23 '12 at 22:20
  • @zxcdw: "working with native code" implies to me machine language, not porting. And if we're talking about data in files or network protocols, that must of course always have explicitly specified endianness. Then there should be no problems at all when code that handles it is run on different platforms - even C code. Native code on bi-endian architectures is of course a different issue. – Michael Borgwardt Sep 23 '12 at 22:34
  • 2
    @zxcdw - even in assembler, you don't always need to know the endian order. Constants, for example, don't need to be specified a byte at a time. The situation is somewhat similar to a certain style of serialization in C - `x & 0xFF` always gives you the least significant byte irrespective of endian ordering (assuming your bytes are 8 bits each) because you've specified the bits you're interested in by their value, not their relative position in memory. –  Sep 24 '12 at 04:42
  • A "mobile ARM", whether in an OSX, Android, or other Linux-based system, is Little Endian. The "Big Endian" mode is theoretical. – MSalters Sep 24 '12 at 09:30
  • Speaking of VM languages, JVM is big endian, .net allows the underlying hardware to shine through and provides a way to check. BitConverter.IsLittleEndian. – stonemetal Sep 24 '12 at 12:53
  • @stonemetal: In what way would JVM allow endianness to matter, given that code which tries to read half of a `long` as an `int` will be rejected, and code which tries to write half of a `long` as an `int` will only be accepted if no attempt is made to read the `long` afterward. – supercat Jul 15 '14 at 23:53
  • @stonemetal: By what mechanism can the JVM convert a sequence of one data type to/from a longer/shorter data type? I am unaware of any JVM instructions that would allow code to interpret elements 8-11 of a `byte[]` as an `int`, or store an `int` into four consecutive elements of a `byte[]`. A library might include methods to perform such conversions for big-endian but not little-endian format, but I would think that would be a property of the library rather than the JVM. Does the JVM offer anything internally? – supercat Jul 16 '14 at 04:38
  • @supercat - it's not part of the JVM *specification*, but HotSpot has intrinsics that perform the non-typechecked access methods from the `sun.misc.Unsafe` class (e.g. `public native byte getByte(Object o, long offset)`), so in a sense the JVM is able to perform such translations. However, the use of these methods is specified as producing undefined behaviour when the type of access is not as declared. Similarly, it provides intrinsics for `long allocateMemory()`, `putInt(long, int)` and `getByte(long)`, but again there is no specification provided of how using these behaves. – Jules Sep 17 '17 at 06:36
8

Endianness only really matters when you are transferring binary data between systems.

With the advancement of processor speed (and the much, much lower cost of storage), binary data interfaces are becoming rarer, so you don't notice them at the application layer. You are either using a textual transfer format (XML/JSON) or a data-layer abstraction that takes care of the translation for you (so you don't even notice that there is a translation).

But when you are coding at the binary data layer you do notice, and it is very important. For example, when I worked at VERITAS (now Symantec) I was building software that was built on 25 different hardware platforms (not only big/little endian; there are other types).
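For illustration, here is a minimal C sketch of the usual trick at that layer: serialize integers by value (shifts and masks) into one agreed-on byte order, so the same code behaves identically on big-endian, little-endian, and bi-endian hosts. The helper names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helpers: write/read a 32-bit value in big-endian order,
 * one byte at a time, regardless of the host's endianness. */
static void put_u32_be(uint8_t *buf, uint32_t x)
{
    buf[0] = (uint8_t)(x >> 24);   /* most significant byte first */
    buf[1] = (uint8_t)(x >> 16);
    buf[2] = (uint8_t)(x >> 8);
    buf[3] = (uint8_t)x;
}

static uint32_t get_u32_be(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because each byte is selected by its value rather than by its position in memory, the file or wire format stays fixed no matter what hardware the code is compiled for.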

Martin York
  • 11,150
  • 2
  • 42
  • 70
  • My students have also developed for mobile phones and used cloud computing, so they know the world is not PCs and Macs. – Ellen Spertus Sep 24 '12 at 00:56
  • @Loki - it's possible to serialize and de-serialize without knowing the endianness of the machine. You only really need to know the byte-ordering of the data in the files/streams/whatever. For example, `(char) (x & 0xFF)` in C gives you the least significant byte irrespective of endian issues, assuming only that a byte is 8 bits. I've designed binary file formats without knowing the machines that the software would run on - I basically chose an endian ordering for the file format without caring about the hardware. –  Sep 24 '12 at 04:50
  • @espertus: Sure possible. – Martin York Sep 24 '12 at 05:16
  • 1
    @Steve314: Yes, of course you can. When you are working on the "Binary Data Layer" you can devise whatever scheme you want to serialize your data, and it is not hard to devise schemes that are portable. Though personally I would not bother to re-invent a wheel that has been built and well tested since the 60's. Look up [`htonl`](http://pubs.opengroup.org/onlinepubs/7908799/xns/htonl.html) and family. This family of functions provides a portable (standard) way of doing things that is optimal for your platform. – Martin York Sep 24 '12 at 05:23
8

No, nobody has won. We as a species have failed to standardize the order in which we store our bytes, along with the direction we write and the side of the street we drive on.

As a consequence, anyone who wants to transfer data between two different systems, over a network or in a file, has only about a 50% chance of the reasonable initial version of their data-dumping code being correct in their own environment, and even if it works there, it has only a 50% chance of working in their customer's.

To deal with this, you need to go look up platform-specific functions with names like "htonl" in headers whose names obviously date back to the 70's, like "arpa/inet.h", because the situation has not improved since then and probably never will.
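For concreteness, a small hedged sketch of that traditional approach in C: convert to network (big-endian) byte order with `htonl()` before writing, and back with `ntohl()` after reading. The wrapper names here are invented for the example.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl(), ntohl() -- POSIX */

/* Hypothetical helper: store a 32-bit length field into a packet buffer
 * in network byte order, whatever the host's native order is. */
void encode_length(uint8_t *packet, uint32_t length)
{
    uint32_t wire = htonl(length);      /* host order -> network (big-endian) */
    memcpy(packet, &wire, sizeof wire);
}

/* Hypothetical helper: read the same field back on the receiving side. */
uint32_t decode_length(const uint8_t *packet)
{
    uint32_t wire;
    memcpy(&wire, packet, sizeof wire);
    return ntohl(wire);                 /* network -> host order */
}
```

On a big-endian host these calls compile down to no-ops; on a little-endian host they byte-swap, which is exactly the detail they exist to hide.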

Andrew Wagner
  • 181
  • 1
  • 3
  • 24
    turns out we have standardised - instead of sending 4 bytes to represent an integer, we send a block of text formatted with special header text, angle brackets, keywords and an ASCII representation of those 4 bytes. The receiving end then parses the formatting to get the integer text and converts it back into 4 bytes. This is called progress, I'm told :-) – gbjbaanb Oct 29 '15 at 10:53
  • 1
    `$ aptitude search xml | wc -l` → 677 – Andrew Wagner Oct 29 '15 at 14:14
2

There is still no consensus:

  • The majority of larger computer systems (server/desktop/laptop) currently use little-endian architectures
  • The majority of smaller computers (tablets/phones) use an endianness-independent processor architecture, but run operating systems that use little-endian order

So at the hardware level, LE is far more common. But:

  • Most inter-computer communication is carried out using protocols that specify big-endian order
  • A very large proportion of the world's software runs on a virtual platform that defaults to big-endian order whenever data is written to external storage.

Both orders are going to be with us for the foreseeable future.
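If you want to check which convention your own hardware follows (the same check `BitConverter.IsLittleEndian` performs in .NET, mentioned in an earlier comment), a minimal C sketch, assuming 8-bit bytes, looks like this:

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t value = 0x01020304;
    uint8_t first = *(const uint8_t *)&value;  /* lowest-addressed byte */

    if (first == 0x04)
        puts("little-endian host");
    else if (first == 0x01)
        puts("big-endian host");
    else
        puts("something more exotic");
    return 0;
}
```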

Jules
  • 17,614
  • 2
  • 33
  • 63
  • 1
    The majority of the _largest_ systems (i.e., "big iron") are typically big-endian. That is, so-called mini or mainframe systems (which make up a huge amount of the backend processing most of us don't care about.) –  Oct 29 '15 at 13:26
  • @jdv But most of the *largest* **computing systems** are little-endian x86-64 machines, and there, performance matters. – user877329 Sep 16 '17 at 15:03
  • I don't think anyone can make any strong assertions that endianness is anything more than a convenience on the part of the architecture designers (for whatever they want to achieve). At the time I made that ancient comment, big iron was BE, but that is not because BE is inherently better; it's just the way those architectures happen to be. –  Sep 16 '17 at 20:36