Endianess in Computers

The other month, I was working on some code where I had to be concerned about the endianess of the data.  I came across this article from IBM which greatly cleared up some things(I got to the article at least partially from this Stackoverflow quesetion).

Now I’m going to deviate here as that was almost a month ago and I forget what I was going to talk about here.

Anyway, the point here is that when you do any sort of bit-shift or bit-masking, it is always in the code as big-endian format.  When you do an operation, it is always big-endian but how it gets stored differs on what machine you are on.  To get the actual layout of the byte as they are in-memory, I generally cast the data as a uint8_t*.  So for example, let’s take the following code:

uint32_t value = 0x12345678;
uint8_t* value_memory = (uint8_t*)&value;

for( int x = 0; x < 4; x++ ){
    printf( "0x%02X ", value_memory[ x ] );

With this code, there will be two different outputs depending on what endianess your machine is in.  On a big-endian machine(like MIPS) this will print out something like the following:

0x12 0x34 0x56 0x78

However, on a little-endian machine(x86, ARM) it will print out the values in the opposite direction:

0x78 0x56 0x34 0x12

The thing to remember here is that the ‘value’ variable, when printed out on both big and little-endian systems, will print out as the same value.

If you need to force a particular byte order, make an array in the code(this is shown in the IBM article).  If you need to convert a particular byte sequence into the proper endianess for your processor, you can simply OR the two bytes together.

For example, let’s assume that we get the following two bytes in big-endian order:

uint8_t bytes[] = { 0x12, 0x34 };

To convert this to your processor endianess as a 16-bit value, you only have to OR the two bytes together, and since OR-ing will always act as though the bytes were in a big-endian format, your data has magically become the proper endianess.

uint16_t proper_endianess = (bytes[ 0 ] << 8 ) | bytes[ 1 ];

Anyway I hope that somewhat clears some things up.  (Although I think I may have simply confused people more).

Leave a Reply

Your email address will not be published. Required fields are marked *