Let's have a look at an actual machine instruction. Suppose we have an ARM CPU and we want to add 143 to the value in register 2, placing the result in register 1. In ARM assembly language that's written
ADD R1, R2, #143
This assembly instruction can be encoded as a single machine instruction. The specification of how that's done is on physical page 156 of the ARM ARM, the amusingly-named ARM Architecture Reference Manual. It's also necessary to look at the definition of "shifter operand", which begins on physical page 444.
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
| Cond | 0 0 I 0 1 0 0 S| Rn | Rd | shifter operand |
As you seem to already understand, machine instructions are numbers, and on the ARM, they are numbers of a fixed size: 32 bits, divided into several fields. To encode the above ADD, we fill in the fields like this:
| cond | fmt | I | opcode | S | Rn | Rd | rot | imm |
| E | 00 | 1 | 0100 | 0 | 2 | 1 | 0 | 143 |
(The "shifter operand" got divided into "rot" and "imm" because I set I=1.) Now, to make that into a single 32-bit number, we have to expand it out to binary, because many of the fields are not tidy numbers of bits long:
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
1 1 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 1 1 1 1
To humans that is a big blur; hexadecimal is easier for us to understand:
1110 0010 1000 0010 0001 0000 1000 1111
E 2 8 2 1 0 8 F
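The packing step can be sketched in a few lines of Python. This is only an illustration of the field layout shown above, not an assembler: the function name `encode_data_processing_imm` and its parameter names are made up for this example, and it assumes the immediate form (I=1) with every field already in range.

```python
def encode_data_processing_imm(cond, opcode, s, rn, rd, rot, imm):
    """Pack the fields of an ARM data-processing instruction (immediate form).

    Hypothetical helper for illustration; it does no range checking.
    """
    fmt = 0b00   # bits 27-26 are fixed at 00 for this instruction class
    i_bit = 1    # bit 25: the shifter operand is split into rot + imm
    return (
        (cond << 28)
        | (fmt << 26)
        | (i_bit << 25)
        | (opcode << 21)
        | (s << 20)
        | (rn << 16)
        | (rd << 12)
        | (rot << 8)
        | imm
    )


# ADD R1, R2, #143 with condition code E (always)
word = encode_data_processing_imm(
    cond=0xE, opcode=0b0100, s=0, rn=2, rd=1, rot=0, imm=143
)
print(f"{word:032b}")  # 11100010100000100001000010001111
print(f"{word:08X}")   # E282108F
```

Each field is shifted left to the position of its lowest bit and the results are OR-ed together, which is exactly what writing the fields side by side in binary does.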
And so usually, in communication with other humans, we say that the "machine instruction" corresponding to ADD R1, R2, #143 is the hexadecimal number E282 108F. We could equally say that it is the decimal number 3,800,174,735, but that obscures the pattern of fields more than hex does. (Someone with a lot of practice debugging on the bare metal on ARM would be able to pick condition code E, source and destination registers 2 and 1, immediate operand 8F = 143 out of E282 108F with relative ease.)
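What that practiced reader does in their head is a handful of shifts and masks. Here is a rough Python sketch; the name `decode_fields` is invented for this example, and it assumes the word really is a data-processing instruction in immediate form, which a real disassembler would have to check first.

```python
def decode_fields(word):
    """Split a 32-bit ARM data-processing word into its fields.

    Hypothetical helper: it assumes the immediate form (I=1) and does
    not verify that the word is actually this kind of instruction.
    """
    return {
        "cond": (word >> 28) & 0xF,    # bits 31-28
        "I": (word >> 25) & 0x1,       # bit 25
        "opcode": (word >> 21) & 0xF,  # bits 24-21
        "S": (word >> 20) & 0x1,       # bit 20
        "Rn": (word >> 16) & 0xF,      # bits 19-16
        "Rd": (word >> 12) & 0xF,      # bits 15-12
        "rot": (word >> 8) & 0xF,      # bits 11-8
        "imm": word & 0xFF,            # bits 7-0
    }


fields = decode_fields(0xE282108F)
print(fields)
# cond=14 (0xE), I=1, opcode=4 (0b0100, ADD), S=0, Rn=2, Rd=1, rot=0, imm=143
```

Most of these fields are exactly four bits wide and aligned to a hex digit, which is why E, 2, 1 and 8F can be read straight off the hex form.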
All of the above representations encode the same machine instruction! I have only changed how I wrote it down.
In terms of "ones and zeroes", if you load a program containing this instruction into RAM on a real computer, somewhere in memory the bit pattern 1110 0010 1000 0010 0001 0000 1000 1111 will appear (possibly with its four bytes in reverse order, because of endianness). But it is equally valid to say that somewhere in memory the hexadecimal number E282 108F, or the decoded instruction ADD R1, R2, #143, appears. Bit patterns in RAM have no meaning in themselves; meaning comes from context. Conversely, that bit pattern / hexadecimal number isn't necessarily an instruction at all! It would also appear in a program that made use of the unsigned 32-bit integer 3,800,174,735, or the single-precision IEEE floating point number -1.199634951 × 10^21, as data.
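To see the "meaning comes from context" point concretely, here is a small Python sketch using the standard `struct` module. It takes the same four bytes and reads them back as an unsigned integer and as a single-precision float; the variable names are just for this example.

```python
import struct

word = 0xE282108F

# The same 32-bit value laid out in memory in each byte order.
little = struct.pack("<I", word)  # least significant byte first
big = struct.pack(">I", word)     # most significant byte first
print(little.hex())  # 8f1082e2
print(big.hex())     # e282108f

# Reinterpret the same four bytes under two different "contexts".
as_uint = struct.unpack("<I", little)[0]
as_float = struct.unpack("<f", little)[0]
print(as_uint)   # 3800174735
print(as_float)  # roughly -1.1996e+21
```

Nothing in the bytes themselves says which reading is intended; the program that uses them decides.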