24

Computer programmers often recite the mantra that x86 instructions are totally opaque: Intel tells us what an instruction is supposed to do, but there is no hope that anyone can verify what is actually happening, so if the NSA tells them to backdoor their RNGs, then we can't really do anything about it.

Well, I believe that computer programmers can't do anything about this problem. But how would an electrical engineer attack it? Are there techniques an electrical engineer could use to verify that a circuit actually performs the operations described in its spec, and no other operations?

user14717
  • 367
  • 2
  • 5
  • 5
    You'd have to do something like X-ray the die and analyze everything to see what it's actually doing. Basically reverse engineer the chip and account for the function of every circuit. Totally impractical. – DKNguyen May 22 '19 at 14:12
  • 7
    No electrical circuit performs to an exact spec because of noise and the slight possibility that one day there will be a glitch that is "big enough". – Andy aka May 22 '19 at 14:13
  • 1
    There are techniques and tools to reverse-engineer chips - X-ray, as said, microscopes and such. Modern microprocessors are *extremely* complex beasts, so such work will be extremely difficult, but possible. – Eugene Sh. May 22 '19 at 14:16
  • 1
    This is very much like trying to reconstruct the Windows source code from its kernel image. Possible in theory, true, but way too much effort... – michi7x7 May 22 '19 at 14:59
  • 5
    Fun info: This is vaguely related to [Laplace's demon](https://en.wikipedia.org/wiki/Laplace%27s_demon). – Harry Svensson May 22 '19 at 15:33
  • 1
    as a fascinating opposing view: Companies spend lots of effort making tamper-resistant CPUs. They'll do fun things to make a chip erase itself or even destroy itself if you attempt to open it up (really useful if you have your keys on a smart card). The fact that they take this effort suggests that it is possible to glean information from the CPU directly. As another interesting note, in a purely software world, consider the famous [Ken Thompson login hack](http://cm.bell-labs.com/who/ken/trust.html) – Cort Ammon May 22 '19 at 23:07
  • It's only impossible if you don't have a complete design for the CPU, which you don't have unless you work on the CPU team at Intel. – user253751 May 23 '19 at 00:17
  • 6
    It's going to be easier to steal internal documents from Intel's content database than it would be to reverse engineer even a single modern, complex Intel CPU. – forest May 23 '19 at 00:44
  • 2
    *In principle*, an evil silicon engineer might devise a pathway that only becomes active after many years due to ion migration in aging silicon dies. This may enable a feature or a change in behaviour that was previously impossible to observe. Don't have nightmares though :D –  May 23 '19 at 13:15
  • He would start by getting an EE degree. I'm sorry, but it's apparent you don't have a foggy clue how CPUs work, and are just repeating scare stories. Right off the bat, CPUs are too primordial to conceal something as complex as a crypto backdoor (that couldn't be detected by testing). Such a thing would be hidden in software, which is very easy to reverse engineer **if you try**. To start with, you'd use an open source OS, ideally one you compiled yourself, rather than a closed-source Windows box... There's so much you can do; how about doing some of it... – Harper - Reinstate Monica May 23 '19 at 14:30
  • 15
    @Harper your attitude is unconstructive, and your assertion that a backdoor can't be concealed in hardware is not true. – pjc50 May 23 '19 at 15:46
  • 3
    @Harper: Modern x86 CPUs are not simple, and have lots of firmware, not just pure fixed-function hardware. Some complex instructions are microcoded. For example, Intel was able to add brand-new functionality into existing CPUs by releasing a microcode update to add Spectre mitigation features. The Write-model-specific-register (`wrmsr`) instruction is basically a hook that lets x86 code "call" into arbitrary new microcode with the MSR number as a "call number". Mitigation for RIDL vulns even modified a rarely used microcoded unprivileged instruction (`verw`) to add semantics to it. – Peter Cordes May 23 '19 at 20:19
  • 3
    As far as back-doors in general, you could easily imagine a microcode update that "listened" for a sequence of [`verw` or `verr`](https://www.felixcloutier.com/x86/verr:verw) instructions with specific operands, and dropped the CPU into system-management mode (ring -1, above even normal kernel or hypervisor privilege) while still executing the code that "knocked" with the right sequence. The ultimate local privilege escalation exploit on any system where you can run machine code in user space. The NSA sneaking something like that into Intel's microcode is not beyond the realm of possibility. – Peter Cordes May 23 '19 at 20:22
  • 2
    @user14717: *x86 instructions are totally opaque* not entirely. For many of the simpler ones, there are performance counters that let us figure out how they work. e.g. [`xchg` is 3 uops on Intel CPUs](//stackoverflow.com/q/45766444) and we can run some experiments to find out that the dst -> src direction has 1c latency and the other direction has 2c latency, presumably with a `mov` uop to an internal-use-only register reserved for use by microcode. We know (from Intel patents) a fair bit about internals. We can't truly trust things that we can't confirm by experiment, though. – Peter Cordes May 23 '19 at 20:27
  • Notice that in most cases it is even sort-of impossible to tell what the *software* really does. Even open source software can contain overlooked "bugs", and even the scenario of a malicious compiler inserting malicious code into executables compiled from perfectly safe source code has been suggested. While the behaviour of software is likely much easier to verify than hardware, it's still not often done. Why? Bad cost/risk relation. – JimmyB May 24 '19 at 12:28

6 Answers

24

Are there techniques an electrical engineer could use to verify that a circuit actually performs the operations described in its spec, and no other operations?

In theory, yes, I think this is possible. However, for a complex CPU it will take a lot of time and money. Also, if you do not fully know and understand the design, you will be unable to judge if any activity is "legit" or not.

A CPU is "just" a complex digital circuit consisting of many logic cells.

It is possible to reverse engineer the chip and reconstruct the design by observing the metal connections. There can be many of these connection layers, sometimes 8 or more.

You will need experts in the field to recognize the logic cells, and then software might be able to figure out how they are all connected so you can reconstruct the netlist.

Once you have the netlist you "know" the design. That doesn't mean you now also know how it works!
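
As a toy illustration of that gap between "having the netlist" and "understanding the design" (the code below is invented for this answer, not taken from any real reverse-engineering effort): the five "gates" in this fragment form a perfectly ordinary recovered netlist, yet nothing in the wiring itself announces that the block is a 1-bit full adder; you only learn that by analyzing or simulating it.

```c
/* Hypothetical toy "netlist": each line is one recovered gate driving a named
 * net. Recovering this from silicon gives you the wiring, not the intent. */
#include <stdio.h>

int main(void) {
    /* Exhaustively simulate the recovered block to guess what it does. */
    for (int a = 0; a < 2; a++)
        for (int b = 0; b < 2; b++)
            for (int cin = 0; cin < 2; cin++) {
                int n1 = a ^ b;       /* XOR gate */
                int n2 = a & b;       /* AND gate */
                int n3 = n1 & cin;    /* AND gate */
                int out0 = n1 ^ cin;  /* XOR gate */
                int out1 = n2 | n3;   /* OR gate  */
                printf("%d%d%d -> out1=%d out0=%d\n", a, b, cin, out1, out0);
            }
    return 0; /* the truth table reveals a full adder: out0 = sum, out1 = carry */
}
```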

It could be that a certain function activates two sections of the design while you think one should be enough, so you suspect some suspicious activity is going on, when in fact the design is just using a clever trick you do not know about to speed up operations.

Without knowing and understanding the design, any conclusion you draw might still be wrong. Only the engineers who designed the CPU have all the design information and stand the best chance of being able to figure out or guess what actually goes on or should go on in a CPU.

Bimpelrekkie
  • 80,139
  • 2
  • 93
  • 183
  • 77
    *Only the engineers which designed the CPU know everything that goes on* - I happen to be an engineer working in this industry and let me assess this statement as being a very optimistic one :) – Eugene Sh. May 22 '19 at 14:27
  • @EugeneSh. :-) You have a point, I'll change that to *Only the engineers who designed the CPU might know everything that goes on* – Bimpelrekkie May 22 '19 at 14:31
  • 18
    No, the CPU designers *would not* know everything that goes on - design at that level is dependent on synthesis tools, and those could inject behavior beyond that in the HDL design. To take a non-nefarious example, a lot of FPGA tools will let you compile in a logic analyzer. – Chris Stratton May 22 '19 at 16:02
  • 9
    Reverse engineering a chip with "billions of transistors" would present a challenge. https://spectrum.ieee.org/semiconductors/processors/3d-xray-tech-for-easy-reverse-engineering-of-ics – Voltage Spike May 22 '19 at 17:26
  • Why can't people other than the engineers who designed the CPU know what's going on? Efforts like Visual6502 let even ordinary people like me have a look. – Lorraine May 23 '19 at 07:50
  • 4
    @Wilson Because complex circuits (including CPUs) will contain many proprietary (and secret, trademarked / patented even) designs which are not made available to the general public because the companies which own those designs want to benefit (earn money) from them. The 6502 is an **old design**, it does not have any valuable design information anymore so yeah, that's fully open and available to everyone. – Bimpelrekkie May 23 '19 at 07:57
  • 3
    @Bimpelrekkie: If they're patented, they're by definition not secret. That's the point of a patent. You trade a secret for a temporary monopoly. – MSalters May 23 '19 at 12:00
  • 2
    @MSalters Of course you're right, then read my "patented" as "to be patented". I'd also like to remark that even when an idea is patented it does not mean "all" information is revealed. It is possible some implementation details are kept secret; the general idea is patented but not all details are shared. – Bimpelrekkie May 23 '19 at 12:05
  • @Bimpelrekkie Aside: "Trademarked" isn't a term that's applicable, since trademark is entirely about company image, not actual product designs – Delioth May 23 '19 at 14:50
  • 1
    @Wilson, the 6502 has 3510 transistors total. I wouldn't be surprised to find that a single ALU of a single core of a modern CPU has many times that many transistors -- not to mention the transistors in the branch predictor, the caches, the instruction decoders, the register files, the FPU, the instruction dispatchers, the retirement unit, and many other pieces that the 6502 simply didn't have. Modern CPUs are *complicated*. – Mark May 23 '19 at 23:24
14

The best paper I have read on the subject is "Stealthy Dopant-Level Hardware Trojans" (Becker et al) from 2014.

Since the modified circuit appears legitimate on all wiring layers (including all metal and polysilicon,) our family of Trojans is resistant to most detection techniques, including fine-grain optical inspection and checking against “golden chips." We demonstrate the effectiveness of our approach by inserting Trojans into two designs — a digital post-processing derived from Intel’s cryptographically secure RNG design used in the Ivy Bridge processors and a side-channel resistant SBox implementation — and by exploring their detectability and their effects on security.

The paper describes how the change is made, how it's extremely hard to detect from inspecting the silicon, techniques for hiding it from the production test, and how it can be made to either reduce the security of a hardware crypto RNG or to leak key information through a power-rail side-channel of an AES implementation.

Side-channels are an emerging field of interest. Intel have been plagued by problems relating to speculative execution leaking information from memory that wasn't even being used by the program. Could that have been a deliberate design flaw? It's almost impossible to tell.

JRE
  • 67,678
  • 8
  • 104
  • 179
pjc50
  • 46,540
  • 4
  • 64
  • 126
  • Wouldn't a side channel require some sort of transmitter to send the information to the NSA? Otherwise I surely would notice someone measuring the power rail current on my laptop while I'm working on it. – Dmitry Grigoryev May 29 '19 at 07:47
9

Well, I believe that computer programmers can't do anything about this problem. But how would an electrical engineer attack it?

There are no good ways to find backdoors; one way to find a hardware backdoor would be to test instruction combinations and undocumented instructions. Here's a good talk by someone who actually does this and performs audits on x86 hardware. This can be done without cracking the chip open. One problem with Intel (I'm not sure about other chips) is that it actually has a separate processor with Linux running on it, so there is also software running on some processors, and you supposedly don't have access to that.

Are there techniques an electrical engineer could use to verify that a circuit actually performs the operations described in its spec, and no other operations?

There are ways to use the hardware itself to test its functionality. Since x86 has an undocumented portion of its instruction set, it would be unusual to introduce backdoors into normal instructions, because that would introduce the possibility of bugs (imagine a backdoor in an add or multiply instruction), so the first place to look would be the undocumented instructions.
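
As a rough sketch of what probing the instruction space can look like (this is my own minimal illustration, not the tooling from the talk; real scanners such as sandsifter are far more systematic, and the bytes below are just the well-known `UD2` opcode used as a stand-in for a candidate sequence): place candidate bytes in an executable page and observe whether the CPU faults.

```c
/* Minimal sketch: execute a candidate byte sequence and report whether the
 * CPU raised #UD (SIGILL) or ran it. Assumes x86-64 Linux. */
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf env;

static void handler(int sig) { siglongjmp(env, sig); }

int main(void) {
    /* Candidate bytes to test, followed by RET so control can return.
     * 0F 0B is UD2, a deliberately undefined opcode, used here as an example. */
    unsigned char candidate[] = { 0x0F, 0x0B, 0xC3 };

    unsigned char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) { perror("mmap"); return 1; }
    memcpy(page, candidate, sizeof candidate);

    struct sigaction sa = { .sa_handler = handler };
    sigaction(SIGILL, &sa, NULL);
    sigaction(SIGSEGV, &sa, NULL);

    int sig = sigsetjmp(env, 1);
    if (sig == 0) {
        ((void (*)(void))page)();              /* try to execute the bytes */
        puts("executed without faulting");
    } else {
        printf("faulted with signal %d\n", sig);
    }
    return 0;
}
```

A real audit would sweep prefixes and opcode bytes systematically and compare the observed behaviour (fault or not, instruction length, side effects) against what the manuals claim.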

If you did need to test the functionality of regular instructions, you could measure the time they take to execute and the power they draw while running, and look for differences from what you'd expect.
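
The timing half of that can be sketched very simply with the time-stamp counter (an assumption-laden illustration: it presumes an x86-64 Linux box with GCC or Clang, ignores frequency scaling, core pinning and warm-up, and the loop body is an arbitrary stand-in for the instruction sequence under test; serious work would use performance counters):

```c
/* Minimal sketch: measure the rough cost of an instruction sequence with the
 * TSC and compare it against the documented expectation. */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

int main(void) {
    const int iters = 1000000;
    volatile uint64_t x = 1;    /* volatile keeps the loop from being optimized away */

    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    for (int i = 0; i < iters; i++)
        x = x * 3 + 1;          /* stand-in for the instruction(s) under test */
    uint64_t end = __rdtscp(&aux);

    printf("~%.2f reference cycles per iteration\n", (double)(end - start) / iters);
    return 0;
}
```

Consistent, unexplained deviations from the published latency/throughput numbers (or from identical code running on another stepping of the same part) would be the thing to investigate further with proper counters.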

Voltage Spike
  • 75,799
  • 36
  • 80
  • 208
  • I don't get why you'd backdoor undocumented instructions. You want to backdoor features people actually use! – user14717 May 22 '19 at 15:30
  • 3
    I would disagree; it's not impossible that someone would do this, but unlikely. Let's say you backdoored a regular instruction like an add instruction, so that executing some additional instruction opened a backdoor. Then a customer develops a program that has exactly that combination, they look into it, find the backdoor, everyone gets mad and you get sued. Much safer to put a backdoor in the undocumented instructions (or the Linux computer built into CPUs) – Voltage Spike May 22 '19 at 15:38
  • Then when do the backdoored instructions get executed? Do they backdoor the compiler? – user14717 May 22 '19 at 15:47
  • 4
    IME runs Minix which is not Linux and is much smaller and simpler. Linux was inspired by the existence of Minix and originally used its filesystem and was announced on its newsgroup, but they were quite different then and are extremely so now. – Chris Stratton May 22 '19 at 16:07
  • 5
    @user14717 - the *nasty* possibility would be a trigger sequence in a jailed native executable, something like native client. But there's no reason it has to be *code* and not *data*. – Chris Stratton May 22 '19 at 16:10
  • 5
    @laptop2d Bugs where CPUs don't do what the theoretical documentation of the instruction set says happen *all the time*; nobody gets sued, usually: Read the [errata section](https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/7th-gen-core-family-spec-update.pdf) in the Intel 7th gen Core i7 family doc update, for example. Using an undocumented instruction would immediately sound the alarm of any malware researcher. Using an unusual combination of rhythmic ADDs with the right inter-register MOVs is less likely to trigger any alarm. – Marcus Müller May 22 '19 at 16:46
  • 3
    Point being that the "normally used" x86_64 instruction set is so large already that arbitrary sequences of, say, 10 instructions are already so incredibly combinatorially unlikely that you can safely use them in a malware program without anyone ever noticing. – Marcus Müller May 22 '19 at 16:47
  • @MarcusMüller Yeah, instead of saying they get sued, I should have said it's bad for business; an obscure bug -- not bad for business. A hidden backdoor that compromises security easily on any chip -- bad for business. – Voltage Spike May 22 '19 at 16:54
  • 3
    @laptop2d that's why I'd argue that if anyone catches a program that has an illegal instruction actually leading to privilege escalation, that'll be really bad for business, because it's virtually 100% certain that there was malicious intent on the silicon side. "If I read the stack pointer twenty-three times in 100 instructions, with at least one arithmetic instruction between each MOV, followed by a JMP to code that should trigger a system interrupt, but doesn't" might be easier to sell as bug. – Marcus Müller May 22 '19 at 16:57
  • @MarcusMüller Schemes such as you just suggested have the burden of recognizing STATE, and that requires STORAGE. – analogsystemsrf May 22 '19 at 18:22
  • One possibility would be a "knock, knock" pattern of NOP instructions. A pattern of precisely timed NOPs triggers the backdoor, perhaps remapping a relatively unused opcode to perform a different function while executing just that thread. Farfetched, but doable. – Byron Jones May 22 '19 at 19:10
  • 1
    @analogsystemsrf such storage already exists, and already causes nefarious bugs: see for example [this CPU bug](http://gallium.inria.fr/blog/intel-skylake-bug/), description: – mbrig May 23 '19 at 01:35
  • `SKL150 - Short loops using both the AH/BH/CH/DH registers and the corresponding wide register *may* result in unpredictable system behavior. Requires both logical processors of the same core (i.e. sibling hyperthreads) to be active to trigger, as well as a "complex set of micro-architectural conditions"` – mbrig May 23 '19 at 01:36
  • 6
    @laptop2d I was stunned by the "embedded Linux within the CPU" statement. So I did a bit of research; I guess you are talking about the Intel ME engine. Well, it doesn't run on the CPU itself, but on the north bridge chipset. It seems there has been a lot of misinformation about that, see https://itsfoss.com/fact-intel-minix-case/ – dim May 23 '19 at 08:26
  • 1
    @dim: The power-management controller is on the same die as the main cores, and IIRC "has as many transistors as a 486". It may actually *be* an x86, like a quark core or something. *It* doesn't run Linux either; I think I've read minix. Actually it might be the same Intel ME that you're talking about. It has had vulns, like https://www.intel.com/content/www/us/en/support/articles/000030482/software/chipset-software.html. Intel since Nehalem hasn't had a separate "northbridge", the memory controller is integrated and the CPU has some PCIe lanes directly + DMI to southbridge. – Peter Cordes May 23 '19 at 20:36
6

The only way would be to strip down the chip layer by layer, record every transistor with an electron microscope, then enter all of that into some kind of simulation program and watch it run.

This is essentially the Black Box problem, in which you try to reconstruct the internals from measuring inputs and outputs. Once the complexity of the internals, or the number of I/Os, gets beyond the trivial, there is a combinatorial explosion where the number of possible internal states becomes astronomical - the kind of territory where numbers like a googol get thrown about.
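
To put a rough number on that explosion (purely an illustration of scale, not a claim about any particular CPU): a combinational black box with $n$ input bits can implement any of $2^{2^n}$ distinct functions, so

$$2^{2^{9}} = 2^{512} \approx 1.3\times 10^{154} \gg 10^{100}\ \text{(a googol)},$$

i.e. just nine input bits already put exhaustive black-box identification past a googol of candidate functions, and a real CPU has hundreds of pins plus an enormous amount of internal state.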

Voltage Spike
  • 75,799
  • 36
  • 80
  • 208
Dirk Bruere
  • 13,425
  • 9
  • 53
  • 111
  • 2
    ...and it is easier to steal the design using social engineering :) – Eugene Sh. May 22 '19 at 14:20
  • 8
    No. The glaring mistake here is that *simulation* would not be sufficient. Even if you were *given* an *accurate* simulation model, you still would not be able to find carefully hidden behavior, because you have no idea how to *trigger* it. – Chris Stratton May 22 '19 at 15:58
  • 4
    @ChrisStratton I wouldn't call that mistake *glaring*. It's a reasonable assumption that the design was based on doing simplifications that are physically usual, e.g. that you don't put two metallization traces so close together that they couple inductively sufficiently to change the state of a MOSFET gate. That is only a mistake if a) your simplifications don't match the physical model of what the designer used or b) the designer is intentionally hiding something by intentionally breaking the requirements for these simplifications in non-obvious ways. – Marcus Müller May 22 '19 at 16:39
  • Exactly! The *logic* simulator makes the assumption that the actual hardware can be modeled as some kind of logic gates – that's not actually the case, unless you actually design the hardware to do that. Which you, usually, strive to do (e.g. you'd use voltages that are safe to switch a mosfet, not something close to the threshold; you'd use widths so that leakage currents can't become critical); and probably, given complexity, fail to completely achieve, usually, because your logic model is an (over)simplification of reality. However, for as long as your design is sound and adheres to the … – Marcus Müller May 22 '19 at 16:50
  • …constraints you need for the simplified model to still match real behaviour, you'd be fine reverse-engineering it based on the same model. However, as said, it's not necessarily so that you actually adhere to these constraints (they might be unknown, even to the silicon fab) or that you *want* to universally adhere to them. – Marcus Müller May 22 '19 at 16:51
  • 7
    @ChrisStratton ah, sorry, ok, I think now I'm getting your point. You say that even the digital/behavioural clocked models of a CPU are complex enough to hide cases where the programmer's understanding / assumptions simply do not apply. That's true. One could have documented the effects leading to SPECTRE in excruciating detail, and most people would have never thought of caching to having data- or program flow-relevant side effects. Indeed! – Marcus Müller May 22 '19 at 16:54
  • That's an interesting example because it was an *unintended* situation rather than a carefully hidden intentional one. – Chris Stratton May 22 '19 at 16:55
  • 3
    Thanks :) Your argument brings the whole topic of formal verification of the correctness of ISAs back into view ("does this ISA actually guarantee that a compliant CPU does not grant RING 0 privileges to unprivileged code?") and of formal verification of HDL/RTL against such ISA specifications (I like this [RISC-V CPU Core verification](https://github.com/SymbioticEDA/riscv-formal) project especially.) – Marcus Müller May 22 '19 at 17:00
  • 2
    @MarcusMüller: indeed, all the effects leading to Spectre *were* known to computer-architecture people, just nobody had realized it added up to an attack vector. Modern branch predictors usually let aliasing branches "fight" over the prediction, instead of an occasional cold branch erasing the history for a hot branch. And especially ITTAGE (in Haswell and later) index the BTB according to past branch history, so the predictions for a single branch are spread out over lots of space (great for latching onto complex patterns over a few branches). And OoO exec optimistically anticipates success. – Peter Cordes May 23 '19 at 20:54
  • 1
    It had previously been assumed that microarchitectural state wasn't privileged, only the actual data. The idea of triggering mis-speculation to turn secret data into microarchitectural state + a side channel (e.g. cache timing, but others are possible) to read that into architectural state made everyone realize that we're screwed, and there's no easy fix even in hardware. The assumption that mis-speculation is fine as long as it doesn't *directly* become wrong architectural state (register or memory values) is fundamental to modern OoO CPU design. Spectre was obvious once pointed out! – Peter Cordes May 23 '19 at 21:04
  • 1
    Fortunately, unlike Spectre, the rest of the Meltdown and RIDL vulnerabilities are easy to fix in hardware, just by making load ports squash the result to 0 in cases that will fault if they reach retirement. Intel CPUs currently just let load results for cache-miss loads (or other load uops that will fault or need to get replayed) take data from whatever happened to get muxed into the output, like a privileged line in L1d cache, or a cache-line-split load buffer, or whatever. Again the same assumption that this can't become architectural which the same side-channels attack. – Peter Cordes May 23 '19 at 21:12
5

Proving that the CPU isn't doing something sneaky is extraordinarily hard. The classic example is a voting machine. If it has a single bit in it that takes a copy of your vote and later sneaks it out to some dictator, it could be life or death for you in some places. And proving there isn't a single bit like that in among the billions is rather hard.

You might think about isolating the chip physically, so it is practical to see that there are no improper wire connections to it. And putting another chip, or more than one chip in series (from different sources) in its network connection that guarantees it only connects to the right place. Then power cycling it after it has delivered your vote. And hoping that there are no nonvolatile bits in there. Or sneaky wireless connections. But would you trust your life to it?

emrys57
  • 1,042
  • 7
  • 12
5

Transmitting any data to the NSA will require network access, so it will be quite easy to spot such a backdoor by running an OS with network services disabled and checking the network interfaces for traffic. For an open-source OS it's even possible to run with full network support and spot rogue connections by their destination IP, which will not match any address the OS could legitimately access.
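
As a minimal sketch of that kind of check on Linux (standard tools such as `ss`, `netstat` or `tcpdump` do this job properly; the parsing below is deliberately simplistic, only looks at IPv4 TCP, and assumes a little-endian x86 host):

```c
/* Minimal sketch: list the remote ends of established IPv4 TCP connections by
 * reading /proc/net/tcp, so unexpected destinations stand out. */
#include <arpa/inet.h>
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/net/tcp", "r");
    if (!f) { perror("fopen"); return 1; }

    char line[512];
    fgets(line, sizeof line, f);                      /* skip the header row */
    while (fgets(line, sizeof line, f)) {
        unsigned lip, lport, rip, rport, state;
        if (sscanf(line, "%*d: %8x:%x %8x:%x %x",
                   &lip, &lport, &rip, &rport, &state) == 5 && state == 1) {
            struct in_addr addr = { .s_addr = rip };  /* state 01 = ESTABLISHED */
            printf("remote %s:%u\n", inet_ntoa(addr), rport);
        }
    }
    fclose(f);
    return 0;
}
```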

A backdoor based on the RNG with no data transmission will have very limited usefulness. Unless the CPU RNG is the only entropy source, the chances that such a backdoor will provide any advantage to the attacker while not being obvious at the same time are practically zero. Unless you insist that Russell's teapot is out there despite having no good reason to exist, you should be able to apply the same argument to hardware RNG backdoors.
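
To make the "only entropy source" point concrete, here is a minimal sketch of the usual mitigation, mixing the hardware RNG with an independent source instead of trusting it alone (assumptions: x86-64 with RDRAND support, a Linux `/dev/urandom`, and compilation with `-mrdrnd`; the Linux kernel takes a similar but far more careful approach, as noted in the comments below):

```c
/* Minimal sketch: combine RDRAND with OS-provided entropy so that a backdoored
 * hardware RNG alone cannot determine the result (assuming the two sources are
 * independent of each other). */
#include <immintrin.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    unsigned long long hw = 0;
    if (!_rdrand64_step(&hw)) {          /* can legitimately fail; real code retries */
        fprintf(stderr, "rdrand failed\n");
        return 1;
    }

    uint64_t os = 0;
    FILE *f = fopen("/dev/urandom", "rb");
    if (!f || fread(&os, sizeof os, 1, f) != 1) {
        fprintf(stderr, "could not read /dev/urandom\n");
        return 1;
    }
    fclose(f);

    uint64_t seed = hw ^ os;             /* unpredictable if either source is */
    printf("seed: %016llx\n", (unsigned long long)seed);
    return 0;
}
```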

Dmitry Grigoryev
  • 25,576
  • 5
  • 45
  • 106
  • 5
    So you assume that the adversary has the time, money, and skill to create and hide a hardware trojan horse, but the first thing they do is telnet www.nsa.gov? This seems like a very naive point of view. – Elliot Alderson May 23 '19 at 16:23
  • 1
    If the NSA had hidden a vulnerability, then yes they would be hoping that people used `rdrand` or `rdseed` as Intel suggested: as the only entropy source for a PRNG seed. Linux (the kernel) chose not to do that for `/dev/random`, but glibc / libstdc++'s current `std::random_device` *does* use just `rdrand` if it's available at runtime instead of opening `/dev/random`. [Step into standard library call with godbolt](//stackoverflow.com/a/56246283) – Peter Cordes May 23 '19 at 20:44
  • @ElliotAlderson What's your point of view then? How can someone steal valuable data without ever transmitting it somewhere? – Dmitry Grigoryev May 29 '19 at 07:39
  • @PeterCordes `std::random_device` is not a cryptographically strong RNG. C++ standard allows you to implement it with a PRNG, effectively returning **the same sequence every time**, so it's quite obvious nobody should use it for encryption. – Dmitry Grigoryev May 29 '19 at 07:41
  • Oh right, I forgot there's no guarantee that it's any good, xD. It *is* good on many implementations, but MinGW is the standout exception to the design intent that it gives you as good quality random numbers as the platform is capable of, defeating the main purpose of the library. (Which as you say is *not* crypto, but seeding PRNGs for other purposes). ([Why do I get the same sequence for every run with std::random\_device with mingw gcc4.8.1?](//stackoverflow.com/q/18880654)). That would be acceptable on a platform without any entropy (minimal embedded device), but not on x86 Windows! – Peter Cordes May 29 '19 at 07:45
  • My point is that "transmitting it somewhere" does not imply creating a network connection to an obvious adversary website. Do some investigation of codename TEMPEST. – Elliot Alderson May 29 '19 at 11:02
  • @ElliotAlderson Could you elaborate? Don't get me wrong: I like Shakespeare, but I doubt that reading it once again will help me understand your point. – Dmitry Grigoryev May 29 '19 at 11:38
  • Open a new internet browser window. Type in the URL duckduckgo.com. Press the 'Enter' key. In the search box, type "codename tempest". Press the 'Enter' key. Left-click on the article titles to read the articles. – Elliot Alderson May 29 '19 at 11:42