22

I know absolutely nothing about low-level stuff, so this will be a very newbie question. Please excuse my ignorance.

Is machine language - the series of numbers that tell the physical computer exactly what to do - always binary? I.e. is it always composed of only zeros and ones? Or could it also be composed of numbers such as 101, 242, 4, etc.?

Aviv Cohn
  • There are some computers that use [ternary logic](https://en.wikipedia.org/wiki/Ternary_computer). Related Stack Overflow question: [Why binary and not ternary computing?](http://stackoverflow.com/q/764439/289086) – MichaelT Apr 20 '14 at 00:47
  • @MichaelT That's pretty interesting! – WendiKidd Apr 20 '14 at 00:49
  • @WendiKidd there is some application when you get into [optical computing](http://iopscience.iop.org/1402-4896/2005/T118/025) because you can have polarization as a 'value': off, clockwise, counterclockwise (three values). There's also the entire numeral system of [balanced ternary](https://en.wikipedia.org/wiki/Balanced_ternary). It's a bit odd to think about, but it's there (see the sketch after this comment list). – MichaelT Apr 20 '14 at 00:57
  • Previous generations of digital computers were not binary but decimal (http://en.wikipedia.org/wiki/Decimal_computer). Also, a computer may not necessarily be digital, but analog (http://en.wikipedia.org/wiki/Analog_computer). – Thiago Silva Apr 20 '14 at 02:58
  • I was prototyping a compiler a while ago, for Intel processors, and almost everything was written in _hexadecimal_ base in Intel's manuals. So even though my compiler had to write everything in a base-two format, I was reading and writing it in base sixteen, which was a huge advantage. – Mahdi Apr 20 '14 at 09:42
  • @Thiago-Silva but is there anything for analog computers that you'd call machine language? What I've read of that wikipedia page doesn't look like it's talking about programmable devices at all. – mc0e Apr 20 '14 at 14:51
  • Octal used to be another popular representation of machine instructions, too. – sea-rob Apr 20 '14 at 16:37
  • To answer directly -- no. The vast majority of "modern" machines use binary, but in the past many schemes -- decimal, bi-quinary, centesimal, et al. -- were used. And even in modern machines, multi-level signalling may be used in memory chips. As to the instructions/opcodes that drive the computer, they generally (at least for von Neumann designs) use the same coding scheme as the numeric values, meaning that most modern computers use binary, but various other schemes have been used in the past. – Daniel R Hicks Apr 21 '14 at 03:08
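Since ternary logic and balanced ternary come up in the comments above, here is a minimal Python sketch (my illustration, not from the original discussion) of how an ordinary integer can be rewritten in balanced-ternary digits, where each "trit" is -1, 0, or +1 instead of a binary 0 or 1:

def to_balanced_ternary(n):
    # Convert an integer to balanced-ternary digits (-1, 0, +1),
    # most significant trit first. Works for negative n too, because
    # Python's % operator always yields 0, 1, or 2 here.
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        r = n % 3
        if r == 2:              # remainder 2 becomes trit -1, carrying 1
            digits.append(-1)
            n = (n + 1) // 3
        else:
            digits.append(r)
            n //= 3
    return digits[::-1]

print(to_balanced_ternary(5))   # [1, -1, -1], i.e. 9 - 3 - 1 = 5

A machine built on balanced ternary would store trits instead of bits, but the principle is the same as in binary: a small, fixed set of distinguishable physical states per memory element.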

6 Answers

48

Everything in a computer (to be precise, in any typical contemporary computer) is binary, at a certain level. "1s and 0s" is an abstraction, an idea we use to represent a way of distinguishing between two values. In RAM, that means higher and lower voltage. On the hard drive, that means distinct magnetic states, and so on. Using Boolean logic and a base 2 number system, a combination of 1s and 0s can represent any number, and other things (such as letters, images, sounds, etc) can be represented as numbers.
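As an illustration of that last point (my example, not part of the original answer), Python makes it easy to see how a character is just a number, and a number is just a bit pattern:

print(ord('A'))                  # the letter 'A' is stored as the number 65
print(format(ord('A'), '08b'))   # 65 as eight bits: '01000001'
print(format(202, '08b'))        # the number 202 as eight bits: '11001010'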

But that's not what people mean when they say "binary code." That has a specific meaning to programmers: "Binary" code is code that is not in text form. Source code exists as text; it looks like a highly formalized system of English and mathematical symbols. But the CPU doesn't understand English or mathematical notation; it understands numbers. So the compiler translates source code into a stream of numbers that represent CPU instructions that have the same underlying meaning as the source code. This is properly known as "machine code," but a lot of people call it "binary".
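A rough analogy you can run yourself (my addition; Python bytecode is not real machine code, but it is produced in the same general way): Python's compiler turns source text into a stream of numeric instructions, which the standard dis module can decode back into mnemonics:

import dis

code = compile("x + 143", "<example>", "eval")  # source text -> code object
print(list(code.co_code))   # the raw instruction stream, as plain numbers
dis.dis(code)               # the same bytes, decoded into readable mnemonics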

Mason Wheeler
  • Thanks for answering. I was reading some small article, some introduction to low-level stuff (this is the best way I can put it...), and the author said that some specific list of numbers is a list of instructions to the CPU. The list looked something like this: 204, 112, 312, 481, 411. This is not in binary form. How can the CPU understand this? – Aviv Cohn Apr 20 '14 at 01:06
  • @Prog: Like I said, all numbers have a binary representation. 202 is `11001010`, for example. But the 1s and 0s are an abstraction. Instruction #204 could mean "load a value from memory into a CPU register," for example. – Mason Wheeler Apr 20 '14 at 01:18
  • Frankly, if I wanted to be *really* pedantic, I'd point out that no computers actually use "digital" logic; they (at the electrical engineering level) are **all** *fundamentally* analog devices that simply operate in a bi-stable mode that is close enough to the digital approximation to make it a useful analogy. – Fake Name Apr 20 '14 at 11:29
  • Of course, if you want much greater depth regarding the actual hardware implementation of a computer, you can always come over to the [Electrical Engineering](http://electronics.stackexchange.com/) stack exchange and ask us. – Fake Name Apr 20 '14 at 11:35
  • @MasonWheeler Assuming the number 202 (11001010) tells the computer to "load a value from memory into a CPU register": does this happen by the computer actually executing the binary code? I.e. executing the series of ones and zeros as their physical meaning? 1 - high voltage, 1 - high voltage, 0 - low voltage, 0 - low, 1 - high, 0 - low, 1 - high, 0 - low? Or is it just an arbitrary number? – Aviv Cohn Apr 20 '14 at 12:26
  • @Prog: No; reading instructions one bit at a time and making decisions based on it would take too long, and CPUs are designed to be fast. When reading machine code, it's like Whatsisname said in his answer: the computer doesn't look at individual bits any more than you or I read by looking at individual letters. They're arbitrary numbers that map to values using an agreed-upon mapping (see the sketch after this comment thread). Sometimes the meaning of the machine code is hard-wired into the CPU transistors, and sometimes it's actually [programmed in!](http://en.wikipedia.org/wiki/Microcode) – Mason Wheeler Apr 20 '14 at 13:05
  • @MasonWheeler If the numbers that form machine language are just arbitrary, and interpreted by the CPU to some meaning - then it means that machine code is not the lowest level...? I mean, of course it's the lowest level. But if additional interpretation is needed for the CPU to understand what each arbitrary number means in terms of what the physical computer needs to do, then it seems to me it's not the lowest level. The lowest level is actually telling the computer "do a high voltage, do a low voltage". Am I wrong? – Aviv Cohn Apr 20 '14 at 13:10
  • @Prog: There's always a lower level. Machine code is the lowest level that programmers have to worry about, but below that is microarchitecture (the stuff you're talking about), which is based on transistors and logic gates, which are based on the principles of electronics, which are based on the laws of physics and quantum mechanics. There isn't really something that says "do a high voltage, do a low voltage", because it's not always a voltage. (For example, when you store it to disc, it's a magnetic state instead.) That's why we use `1` and `0` as convenient abstractions. – Mason Wheeler Apr 20 '14 at 14:47
  • @MasonWheeler So I understand that the CPU needs to interpret the numbers in the machine language (203, 410, etc.). Is this interpretation written in software? Or is it purely physical? – Aviv Cohn Apr 20 '14 at 16:43
  • @Prog: That depends on the CPU. Some CPUs have it hard-wired into the transistors, others are programmed in microcode. Some do both. – Mason Wheeler Apr 20 '14 at 17:16
  • Thanks for your help, one last question: From what you're saying, I understand that *there is no way to directly control the actions of the physical computer by programming*. The most control a programmer has over the physical computer is telling it to perform an action from a predefined set of actions offered by the computer. The programmer could say `274` to make the computer store a number in a register, or `412` to do some other predefined action. *But the programmer has no way to actually control **how** the computer executes these actions physically.* Not even with machine language. – Aviv Cohn Apr 20 '14 at 22:31
  • Machine language is simply composed of instructions from a predefined set of instructions that do specific things, offered by the computer. The programmer can't, for example, tell the computer "turn that voltage thing in the chip on" (*yeah, that's my understanding of low level, pretty much*), if this action wasn't offered as a predefined action by the computer's manufacturer. Correct? – Aviv Cohn Apr 20 '14 at 22:34
  • @Prog: That's right. Our abstractions are *way* beyond that stage now. Heck, there are some cases where the computer doesn't even execute machine code exactly literally as it's written, due to superscalar architecture--the CPU optimizes things internally to run faster. – Mason Wheeler Apr 20 '14 at 22:34
  • Okay, just to summarize: machine language is still far more high-level than actual physical operations, right? Meaning, the 'command' `318` in raw machine code doesn't map 1:1 to any physical operation in the physical computer, right? It may move a value from one location to another, but those are still abstractions, not physical terms. So that means the assumption 'machine code directly controls the physical computer' is false. Right? – Aviv Cohn Apr 20 '14 at 22:42
  • @Prog: it depends. Simpler/older processors (especially RISC ones) have a machine code with a stronger connection to the operations actually performed inside the processor. In modern x86 processors, instead, it's all fake - the x86 opcodes are mostly just a compatibility layer that hides the actual microarchitecture of the processor, which is thus allowed to change in each new generation of CPUs. These processors internally do all kinds of tricks - pipelining, instruction reordering, branch prediction, ... - so what happens at the physical level is quite distant from the assembly you may write. – Matteo Italia Apr 21 '14 at 00:15
  • -1 - Certainly many older machines used decimal or bi-quinary or other such schemes, and even in modern machines it's quite likely that some of the memory chips use multi-level signalling vs. pure binary. – Daniel R Hicks Apr 21 '14 at 03:02
  • @DanielRHicks Just curious, do you know of an example where some modern machines use a non-binary signaling system (not counting machines that deal with analog data)? – awksp Jun 06 '14 at 15:47
  • @user3580294 - Don't know of any modern machines. I wouldn't be surprised if some "programmable logic controllers" and the like were decimal (to make programming from a 10-key pad easier), but I don't work in that area so couldn't point to any. – Daniel R Hicks Jun 06 '14 at 16:09
  • @DanielRHicks Ah, OK. Would have been fun to find out how non-binary logic was hiding away in computers somewhere. Thanks though! – awksp Jun 06 '14 at 16:18
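To make the "agreed-upon mapping" idea from this thread concrete, here is a toy Python sketch (my illustration; the opcode numbers 204 and 274 are arbitrary, exactly as discussed above). A dispatch table maps instruction numbers to operations, which is conceptually what a CPU's hard-wired decode logic or microcode does:

# A toy 'CPU': opcode numbers gain meaning only through a lookup table.
registers = {"R1": 0, "R2": 7}
memory = {100: 42}

def load(args):                      # opcode 204: memory -> register
    registers[args[0]] = memory[args[1]]

def add_const(args):                 # opcode 274: register += constant
    registers[args[0]] += args[1]

DECODE_TABLE = {204: load, 274: add_const}

def execute(program):
    for opcode, *args in program:
        DECODE_TABLE[opcode](args)   # 'decode' the number, run the operation

execute([(204, "R1", 100), (274, "R1", 143)])
print(registers)                     # {'R1': 185, 'R2': 7}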
25

Let's have a look at an actual machine instruction. Suppose we have an ARM CPU and we want to add 143 to the value in register 2, placing the result in register 1. In ARM assembly language that's written

ADD  R1, R2, #143

This assembly instruction can be encoded as a single machine instruction. The specification of how that's done is on physical page 156 of the ARM ARM, the amusingly-named ARM Architecture Reference Manual. It's also necessary to look at the definition of "shifter operand", which begins on physical page 444.

 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
|    Cond   | 0  0  I  0  1  0  0  S|     Rn    |     Rd    |          shifter operand          |

As you seem to already understand, machine instructions are numbers, and on the ARM, they are numbers of a fixed size: 32 bits, divided into several fields. To encode the above ADD, we fill in the fields like this:

| cond | fmt | I | opcode | S | Rn | Rd | rot | imm |
|    E |  00 | 1 |  0100  | 0 | 2  | 1  | 0   | 143 |

(The "shifter operand" got divided into "rot" and "imm" because I set I=1.) Now, to make that into a single 32-bit number, we have to expand it out to binary, because many of the fields are not tidy numbers of bits long:

 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
  1  1  1  0  0  0  1  0  1  0  0  0  0  0  1  0  0  0  0  1  0  0  0  0  1  0  0  0  1  1  1  1

To humans that is a big blur; hexadecimal is easier for us to understand:

1110  0010  1000  0010  0001  0000  1000  1111
   E     2     8     2     1     0     8     F

And so usually, in communication with other humans, we say that the "machine instruction" corresponding to ADD R1, R2, #143 is the hexadecimal number E282 108F. We could equally say that it is the decimal number 3,800,174,735, but that obscures the pattern of fields more than hex does. (Someone with a lot of practice debugging on the bare metal on ARM would be able to pick condition code E, source and destination registers 2 and 1, immediate operand 8F = 143 out of E282 108F with relative ease.)

All of the above representations encode the same machine instruction! I have only changed how I wrote it down.

In terms of "ones and zeroes", if you load a program containing this instruction into RAM on a real computer, somewhere in memory the bit pattern 1110 0010 1000 0010 0001 0000 1000 1111 will appear (possibly backwards, because of endianness). But it is equally valid to say that somewhere in memory the hexadecimal number E282 108F, or the decoded instruction ADD R1, R2, #143 appears. Bit patterns in RAM have no meaning in themselves; meaning comes from context. Conversely, that bit pattern / hexadecimal number isn't necessarily an instruction at all! It would also appear in a program that made use of the unsigned 32-bit integer 3,800,174,735, or the single-precision IEEE floating point number -1.199634951 × 1021 as data.

zwol
  • So my understanding is, you and the CPU agree on some specific instructions, which are in almost any computer composed of binary bits, like 0111. So 0111 is an abstraction for a specific operation, and that operation in turn is an abstraction for a series of high-voltage/low-voltage switches the CPU has to perform in order for that operation to occur? – doubleOrt May 11 '18 at 21:08
  • Let me word my question differently: "CPU tells me: if you want me to do x, give me series y of bits", and basically, when you give series y of bits to the CPU, it then understands your command and performs some 0/1 switches to make it happen, right? – doubleOrt May 11 '18 at 21:10
  • @Taurus It's a little unusual to talk about it that way because it's so low-level, but, yes. Continuing with the `ADD R1, R2, #143` example, the CPU's "control logic" will process the bit pattern `E282 108F` by making electrical connections that feed the "immediate" value 143 and the value stored in R2 into a binary adder, and then feed the result of the addition into R1. To learn more about how this works, read up on [digital logic](https://www.springer.com/us/book/9783319568379) and then [microarchitecture](http://www.powells.com/book/-9780070570641). – zwol May 11 '18 at 21:23
10

Whenever someone uses the phrase "the ones and zeros" in most contexts, especially this context, they are, in my opinion, significantly misrepresenting what's going on, and thus leading to confusion.

The computer doesn't really just read "the ones and zeros" any more than when you read a book, you are reading "the letters". Sure, both are strictly true, but those statements are leaving out a substantial piece of information: the structure of each.

In the case of English, the letters are structured into words, and the words make up sentences, according to a set of rules. The order of letters in words and the order of words in sentences can completely change the meaning.

A similar process is in play with computers and with machine language. The computer looks at the ones and zeros in discrete chunks, in bytes, and in groups of bytes.

Other posters have mentioned various ways that numbers can be encoded as individual bits. There are ints, floating-point numbers, text strings, etc., which give structure to the stream of bits and bytes.

Ultimately, the computer is conceptually looking at groups of bits, so it's rarely ever looking at "10101010"; it's looking at 101, 242, or 4, etc. What those numbers mean depends on their context in the given 'sentence' they are part of.
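As a concrete illustration of that point (my example, not the answerer's), the very same bytes decode to completely different things depending on how a program chooses to read them:

import struct

data = bytes([72, 105, 33, 0])           # four bytes

print(list(data))                        # as numbers: [72, 105, 33, 0]
print(data[:3].decode('ascii'))          # as text: 'Hi!'
print(struct.unpack('<I', data)[0])      # as one 32-bit integer: 2189640
print(struct.unpack('<f', data)[0])      # as a float: a tiny denormal value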

whatsisname
5

All numbers stored in most computers are technically stored in a binary form. At a hardware level, everything is represented as a series of high and low voltage signals. High voltage signals are ones/true values; low voltage signals are zeros/false values. These are the bits (short for binary digits) mentioned when talking about 32-bit or 64-bit machines. The number (32, 64) in this case refers to how many bits can be addressed out of memory at a time.

So in most modern computers the machine code is just normal values stored in memory, but all of memory is made of bits.
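For instance (a minimal sketch of my own, continuing this answer's point), you can display the bit pattern behind any stored value, padded out to the width of a machine word:

value = 204
print(format(value, '08b'))    # '11001100': the value 204 as 8 bits
print(format(value, '032b'))   # the same value padded to a 32-bit machine word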

4

Almost all "computers" these days use binary logic. However, the meaning of "computer" post-WW II has come to mean a computing device with persistent storage and stored programs, rather than just a simple computing engine like a calculator.

A few examples of the exceptions are:

  • There might be a few odd-ball trinary (or more) logic systems in labs.
  • There are a few analog computing systems in use.
  • An example of a future high-performance computing system not using binary logic might be the D-Wave quantum annealing systems.
Scott Leadley
1

Machine language is not a universal language but rather a strictly CPU-specific language - the language the CPU understands.

You can design a CPU that has 42 states instead of 2 states for the smallest element of memory. The problem is that you cannot come up with a good enough implementation for such a CPU. Actually, some of the first computers (including ENIAC) were decimal computers that implicitly used a decimal machine language.

Whether it is decimal, binary, or something else depends on the number of states the smallest element of memory (a bit) can take; 2 was not chosen for CPU design purposes but rather forced by the electronic implementation: a transistor operates much better and faster with only 2 voltage levels instead of 10 (or any other natural number larger than 2).

Random42