
I have a query regarding big endian and little endian.

Basically, the conversion is used to reverse the byte order in memory.

When we need to do the conversion, do we need to convert each and every data type?

user2720323
    Sharing your research helps everyone. Tell us what you've tried and why it didn’t meet your needs. This demonstrates that you’ve taken the time to try to help yourself, it saves us from reiterating obvious answers, and most of all it helps you get a more specific and relevant answer. Also see [ask] – gnat Oct 25 '13 at 07:39
    Usually you don't need to convert the endianness of integers; you need to read them from the byte sequence using the desired endianness. – CodesInChaos Oct 25 '13 at 09:05
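
A minimal sketch of that idea in C (the helper name here is illustrative): read the value out of the byte sequence in the endianness the format specifies, and the host's own byte order never enters the picture.

#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer.
 * Works identically on any host, because it never reinterprets memory. */
uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}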

3 Answers


The answer is: it depends.

If you're sharing data with another platform, or (de)serializing some binary format with defined endianness, then you need to match that platform's or format's endianness.

Assuming that target endianness is well defined (and different from your native endianness), that will tell you what conversions you need.

Oh, and I'd suggest using htons, htonl and friends rather than twiddling the bytes manually - they're more likely to get optimized to a single BSWAP instruction or similar.
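
As a minimal sketch (assuming a POSIX system, where htonl and ntohl are declared in <arpa/inet.h>), writing a 32-bit value in network (big-endian) byte order looks roughly like this:

#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t host_value = 0x01234567;

    /* Convert to network (big-endian) byte order before serializing. */
    uint32_t wire_value = htonl(host_value);

    unsigned char buffer[4];
    memcpy(buffer, &wire_value, sizeof wire_value);

    /* buffer holds 01 23 45 67 regardless of the host's endianness. */
    printf("%02x %02x %02x %02x\n", buffer[0], buffer[1], buffer[2], buffer[3]);

    /* ntohl reverses the conversion when deserializing. */
    printf("round trip: 0x%08x\n", ntohl(wire_value));
    return 0;
}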

Useless

The byte order of the native integer format is one aspect of the internal binary representation of numbers on various processors (two's complement vs. one's complement is another). The position of the most-significant to least-significant bytes is an important aspect of data storage and transport. For example, byte order is a concern when transferring data between two systems using a binary data format (examples: XDR, Xupl, UBJson, etc.).

We can examine 32-bit (b32) and 16-bit (b16) words on a machine. They are stored in either four bytes (byte[4]) or two bytes (byte[2]), and can be examined as an array of bytes. Consider the arrangement of the bytes in a 32-bit word (byte[4]). Label the bytes A, B, C, D; there are 4! = 24 possible permutations:

ABCD ABDC ACBD ACDB ADBC ADCB 
BACD BADC BCAD BCDA BDAC BDCA 
CABD CADB CBAD CBDA CDAB CDBA 
DABC DACB DBAC DBCA DCAB DCBA 

Two of those permutations, ABCD and DCBA, are the ones commonly used on processors, and we call these orderings the 'endianness' of the processor; that is, the order of the bytes defines the endianness of the processor. Suppose label A is the most significant byte and D the least significant; then these two formats have special names:

  • Big-endian: ABCD, byte significance decreases as memory address increases
  • Little-endian: DCBA, byte significance increases as memory address increases

The 16-bit case is much simpler: only two permutations, AB and BA.

Other formats have been used; the PDP-11 had a middle-endian layout, BADC.

There are many processors (ARM, PowerPC, SPARC V9+, Alpha, MIPS, PA-RISC, IA-64) which can switch between big-endian and little-endian.

Here is a short C program that will tell you the endianness of your processor:

#include <stdio.h>
#include <string.h>
#include <stdint.h>
int
main(int argc, char *argv[])
{
    /* The same four bytes, to be interpreted as a 32-bit word. */
    unsigned char word[4] = {0x01, 0x23, 0x45, 0x67};
    uint32_t be = 0x01234567;   /* value if the bytes are read big-endian            */
    uint32_t le = 0x67452301;   /* value if the bytes are read little-endian         */
    uint32_t me = 0x23016745;   /* value if read middle-endian (PDP-11 style, BADC)  */
    uint32_t we; uint32_t ue;
    memcpy(&we, word, sizeof(we));
    if( we == be ) printf("Big-endian\n");
    if( we == le ) printf("Little-endian\n");
    if( we == me ) printf("Middle-endian\n");

    /* Build "UNIX" as a 32-bit value and print its in-memory byte order. */
    char UNIX[4+1]="UNIX";
    ue  = ((uint32_t)'U'); ue<<=8;
    ue += ((uint32_t)'N'); ue<<=8;
    ue += ((uint32_t)'I'); ue<<=8;
    ue += ((uint32_t)'X');
    printf("%s = %.4s\n",UNIX,(char*)&ue);

    /* Dump the test word's bytes in memory-address order. */
    int ndx;
    unsigned char *p = word;
    printf("@%p:\n", (void*)p );
    for( ndx=0; ndx<(int)sizeof(we); ndx++ )
    {
        printf("[%02x] %03d:%02x\n", ndx, p[ndx], p[ndx] );
    }
    return 0;
}
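
On a typical x86 or x86-64 machine this prints Little-endian and then UNIX = XINU (the classic 'NUXI' illustration of byte order); on a big-endian machine it prints Big-endian and UNIX = UNIX.
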
ChuckCottrill

The conversion itself is usually some fancy bit-twiddling solution like this (for a 32-bit integer):

i = ((i & 0xff000000u) >> 24) | ((i & 0x00ff0000u) >> 8) | ((i & 0x0000ff00u) << 8) | ((i & 0x000000ffu) << 24);

Why it is needed: some architectures are big-endian and some are little-endian, and if they want to communicate through a byte stream they need to agree on the endianness of that byte stream, so that a 1 doesn't become 16,777,216 (2^24).
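
In C this swap can be written as a small function; a minimal sketch, using an unsigned type so the right shifts are logical rather than arithmetic:

#include <stdint.h>
#include <stdio.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t i)
{
    return ((i & 0xff000000u) >> 24) |
           ((i & 0x00ff0000u) >> 8)  |
           ((i & 0x0000ff00u) << 8)  |
           ((i & 0x000000ffu) << 24);
}

int main(void)
{
    /* A 1 read with the wrong byte order becomes 0x01000000 = 16,777,216. */
    printf("0x%08x -> 0x%08x (%u)\n", 1u, swap32(1u), swap32(1u));
    return 0;
}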

The conversion is only applied to primitives that are larger than 1 byte; the other (composite) types are generally defined as a sequence of other composite types and/or primitives, and those stay in their defined order.

For example, the LAS file format is defined as little-endian. This means the first header bytes, in order, are (a reading sketch follows the list):

  1. first char of signature, (L)
  2. second char of signature, (A)
  3. third char of signature, (S)
  4. fourth char of signature, (F)
  5. low byte of file source ID,
  6. high byte of file source ID,
  7. low byte of global encoding,
  8. high byte of global encoding,

and so on
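
A minimal sketch in C of reading those first fields, assuming the little-endian layout above (the buffer contents and helper names are illustrative, not taken from the LAS specification text):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read a 16-bit little-endian value from a byte buffer. */
static uint16_t read_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Parse the first LAS header fields from a raw byte buffer. */
static void parse_las_header_start(const unsigned char *buf)
{
    char signature[5];
    memcpy(signature, buf, 4);                        /* bytes 0-3: "LASF" */
    signature[4] = '\0';

    uint16_t file_source_id  = read_u16_le(buf + 4);  /* bytes 4-5, low byte first */
    uint16_t global_encoding = read_u16_le(buf + 6);  /* bytes 6-7, low byte first */

    printf("signature=%s source_id=%u encoding=%u\n",
           signature, file_source_id, global_encoding);
}

int main(void)
{
    /* Example bytes: "LASF", file source ID 42, global encoding 1. */
    const unsigned char header[8] = { 'L','A','S','F', 0x2a,0x00, 0x01,0x00 };
    parse_las_header_start(header);
    return 0;
}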

ratchet freak