I am writing an assembler, located in the file asm.c in this repository. It uses the instruction set described in the specs file to produce an output binary. (The program that would run this binary has not been written yet; its beginnings are in main.c.) Given the example program echochar.sdmasm, the assembler outputs the desired binary. Here it is in hex:

90 00 a0 00

But it only does this so far on a Windows machine under Cygwin. (I have not yet tested it under Linux.) On an Intel-based Mac, this is the resultant binary:

00 90 00 a0

This looks like a difference in endianness, but I thought that could only happen between two completely different processors. This seems to be an endian difference between operating systems, not processors. Is this really the case, or is something else going on here that I am not getting?

Just managed to test it on Linux: the same output error occurs there as on the Mac.

Okay, something else is going on entirely. Output from hd on Linux:

00000000  00 90 00 a0                                       |....|
00000004

Output from hexdump on Linux:

0000000 9000 a000                              
0000004

This is really odd. I can't tell which one is the correct output.

Robert Harvey
  • Are you sure whatever tool you're using to dump the file isn't doing some undetected byte-swapping? – Mike Harris Dec 11 '15 at 18:44
  • Endianness isn't necessarily a function of the processor. Network transport protocols always use Big-Endian. Processors deal with orthogonal endianness by having byte swapping routines. – Robert Harvey Dec 11 '15 at 18:54
  • @MikeHarris I was using the standard hexdump utility. – LordCreepity Dec 11 '15 at 20:09
  • @RobertHarvey some processors are bi-endian. e.g. ARM which the OS chooses IIRC. – Mgetz Dec 11 '15 at 20:49
  • `hexdump` interprets bytes two by two by default (it defaults to `-x`) so on a little endian CPU, it will reverse bytes two by two. `hd` is just a symlink to `hexdump` but causes it to default to `-C`, making it interpret the input one byte at a time. – user2313067 Dec 11 '15 at 20:56
  • `addressString = strtok(NULL, delimitor);` ... hmm ... ?? – Petr Vepřek Dec 11 '15 at 21:56
  • @PetrVepřek This line grabs a hexadecimal number from a line of assembly code. The first call to `strtok()` decodes the instruction, this call decodes the hex number the instruction manipulates. It could be an address, or just data. – LordCreepity Dec 12 '15 at 13:03
  • @LordCreepity, you are right, I overlooked the first call. – Petr Vepřek Dec 12 '15 at 13:30
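
To make the `hexdump` versus `hd` comment above concrete, here is a minimal sketch of the two display modes, assuming a little-endian host as that comment describes. The function names `format_bytewise` and `format_le_words` are hypothetical and only illustrate the idea; they are not taken from either tool's source.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Render bytes one at a time, in stored order (the `hd` style).
   Assumes outsz > 0. */
void format_bytewise(const uint8_t *buf, size_t len, char *out, size_t outsz)
{
    size_t pos = 0;
    out[0] = '\0';
    /* Each entry needs at most 3 characters plus the terminating NUL. */
    for (size_t i = 0; i < len && pos + 4 <= outsz; i++)
        pos += (size_t)snprintf(out + pos, outsz - pos,
                                i ? " %02x" : "%02x", (unsigned)buf[i]);
}

/* Render bytes as 16-bit words read in little-endian order: the second
   byte of each pair becomes the high half of the displayed word.
   Assumes outsz > 0; a trailing odd byte is ignored in this sketch. */
void format_le_words(const uint8_t *buf, size_t len, char *out, size_t outsz)
{
    size_t pos = 0;
    out[0] = '\0';
    /* Each entry needs at most 5 characters plus the terminating NUL. */
    for (size_t i = 0; i + 1 < len && pos + 6 <= outsz; i += 2) {
        unsigned word = (unsigned)buf[i] | ((unsigned)buf[i + 1] << 8);
        pos += (size_t)snprintf(out + pos, outsz - pos,
                                i ? " %04x" : "%04x", word);
    }
}
```

Fed the four bytes 00 90 00 a0 from the question, the first function produces the byte-wise view and the second produces 9000 a000, so both dumps describe the same file contents.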

2 Answers

Something else is going on, likely a bug in the code.

It looks like an off-by-one error to me, not an endianness change.

Check your type widths and your bit shifts.

Lightness Races in Orbit

Given your expected output as shown above (i.e. 90 00 etc), your code is incorrect - it assumes it is running on a big-endian machine. I would suggest either changing your fwrite to a couple of fputc calls, or alternatively assembling your instructions byte-by-byte in an array of uint8_t.
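
A minimal sketch of the byte-by-byte approach, assuming each instruction is a 16-bit value and the output goes to a stdio stream. The names `encode_be16` and `write_instruction` are hypothetical and do not come from asm.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Split a 16-bit instruction into two bytes, most significant first.
   The result is the same on every host, whatever its endianness. */
void encode_be16(uint16_t instruction, uint8_t out[2])
{
    out[0] = (uint8_t)(instruction >> 8);   /* high byte, e.g. 0x90 */
    out[1] = (uint8_t)(instruction & 0xFF); /* low byte,  e.g. 0x00 */
}

/* Write one instruction to a stream in big-endian order.
   Returns 0 on success, -1 on a write error. */
int write_instruction(FILE *fp, uint16_t instruction)
{
    uint8_t bytes[2];
    encode_be16(instruction, bytes);
    if (fputc(bytes[0], fp) == EOF)
        return -1;
    if (fputc(bytes[1], fp) == EOF)
        return -1;
    return 0;
}
```

Because the shifts operate on the value rather than on its memory layout, 0x9000 is always emitted as 90 00. If the output goes through a file descriptor instead of a `FILE *`, the same two-byte array can be passed to `write()`.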

As to the discrepancies of output, it seems some of the viewer programs you are using are reading 16-bit words in little-endian mode and others are reading byte-by-byte, thus the discrepancy. This is made more confusing by the fact that your output is not what you expect, probably causing you to look in the wrong place for the issue.

Jules
  • I'm using `write()`, not `fwrite()`. – LordCreepity Dec 12 '15 at 13:10
  • `write(outputImageFD, &instruction, sizeof(int16_t));` -- this will write 2 bytes of int16_t in the native order of the machine it runs on. On LE, 0x9000 will first write 0x00 and then 0x90. On BE, first 0x90 and then 0x00. – Petr Vepřek Dec 12 '15 at 13:26
  • A hex viewer should display bytes in the stored order. However, as @Jules said, if the viewer tries to interpret the byte data as something else (e.g. int16_t), the displayed value may vary. If the viewer runs on a platform with the same endianness as the writer, the value will match what was stored in the file. If, however, the writer (asm.c) and the viewer run on platforms of different endianness, then the values will differ. – Petr Vepřek Dec 12 '15 at 13:27
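
The native-order behaviour described in the comments above can be checked with a small sketch. The helper names `native_bytes` and `host_is_little_endian` are hypothetical, and the code only considers little-endian and big-endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory representation of a 16-bit value into out[0..1].
   These are the same two bytes, in the same order, that
   write(fd, &value, sizeof value) would hand to the file. */
void native_bytes(uint16_t value, uint8_t out[2])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the least significant byte is stored first
   (little-endian host), 0 otherwise. */
int host_is_little_endian(void)
{
    uint8_t b[2];
    native_bytes(0x0001, b);
    return b[0] == 0x01;
}
```

On a little-endian host, `native_bytes(0x9000, b)` leaves 00 in `b[0]` and 90 in `b[1]`; on a big-endian host the order is reversed.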