Why do CPUs need so much current?

Question

I know that a simple CPU (like Intel or AMD) can consume 45-140 W and that many CPUs operate at 1.2 V, 1.25 V, etc.

So, assuming a CPU operating at 1.25 V and having TDP of 80 W... it uses 64 Amps (a lot of amps).

Why does a CPU need more than 1 A in its circuits (assuming FinFET transistors)? I know that most of the time the CPU is idling, and the 60 A are all "pulses" because the CPU has a clock, but why can't a CPU operate at 1 V and 1 A?
A small and fast FinFET transistor, for example: 14 nm operating at 3.0 GHz needs how many amps (approximately)?
Does higher current make transistors switch on and/or off more quickly?

Modern CPUs (none of which are 'simple') require multiple voltage rails all with their own power requirements. Your question makes many assumptions and has many erroneous statements. You must consider all power requirements and not just those for a single rail. — , Sep 19 '16 at 20:01
Do a FinFET transistor count on a modern CPU. Not *every* FET conducts current from Vdd to ground, but even so, 64 A gets distributed over *a very large number* of these switching FETs. — glen_geek, Sep 19 '16 at 20:02
In certain areas of the CPU, leakage/static currents dominate, like in the 70nm era I have seen 70% of the power consumption of L1/L2 cache being cited as static. — PlasmaHH, Sep 19 '16 at 20:07
If the clock is slow and the CPU is a smaller one, it can operate at 1 W or even less. — Uwe, Sep 20 '16 at 07:10
The total current through all parts of a system is a constant; if a CPU was using 64 amps off the power supply then it would have to be pulling 64 amps out of the wall, but my computer is plugged into a circuit protected by a 15 amp breaker. And there is certainly no way my laptop batteries are providing 64 amps. There's something wrong with your math. Put an ammeter on your computer plug and you'll soon know its amperage. — Eric Lippert, Sep 20 '16 at 11:48
@EricLippert "it would have to be pulling 64 amps out of the wall" - I have a suspicion that the CPU would not be operating on 110 V. — Andrew Morton, Sep 20 '16 at 12:10
The conserved quantity is energy, and on average also power. If a CPU draws 64 Watt, then the power supply must draw _at least_ 64 Watt from the socket. That's <1A even at 110V. — MSalters, Sep 20 '16 at 12:45
@EricLippert The motherboard in your computer contains a multiphase DC to DC converter that steps the supply voltage (12V in the case of a desktop, probably 12-19V in the case of a laptop) down to the core supply voltage. This is done with constant POWER, so the output current ends up being 10-20 times the input current. Not to mention the 12V supply in a desktop computer also comes from a switching power supply which also converts with constant power. The CPU in your computer probably has at least 100 power and ground pins to handle the current. — alex.forencich, Sep 20 '16 at 13:49
@EricLippert: Consider this: a group of 10 capacitors is placed in series, and charged at 1A to 10V each by a >100V supply. Now, transistors between the capacitors switch, and they discharge in parallel, each giving 1A at 10V, for a total of 10A at 10V. The discharge time is then equal to the charge time. Have a second set that charges while the first discharges, and vice versa. DC-DC conversion has a lot of additional complexities to it, but this is the basic concept. — Ben Voigt, Sep 20 '16 at 14:03
@BenVoigt That's the description of a charge pump voltage divider. Typical SMPS supplies used for computers operate completely differently: AC-DC converters (in the supply block) use flyback transformers, and internal low-voltage DC-DC converters (on the motherboard) use buck topologies (with an inductor). Your description is right, but it is not what is actually used here. — dim, Sep 20 '16 at 14:20
@dim You are quite correct, but I'm not up to the task of describing a buck converter in the limited space allowed for comments using terms someone without an EE degree would understand. — Ben Voigt, Sep 20 '16 at 14:33
@EricLippert Also, "Put an ammeter on your computer plug" without carefully describing how to use an adequately rated meter to do so is a dangerous suggestion: a naïve user might put the ammeter across the live and neutral terminals of the mains plug, leading to a not-so-fabulous adventure to a hospital accident and emergency department. — Andrew Morton, Sep 20 '16 at 17:44
@AndrewMorton: It would certainly be wise to use an ammeter designed for that task, yes! Jeff explains it here: https://blog.codinghorror.com/why-estimate-when-you-can-measure/ — Eric Lippert, Sep 20 '16 at 18:10
@EricLippert That is not correct. The current through all parts of a series circuit is equal, but the CPU is not in series with the wall supply. (Also, you say "*all* parts of a system"; the power LED is part of the system, but are there 64 amps going through it? Or the CPU heatsink? Or the little screw that holds my graphics card in?) — user253751, Sep 21 '16 at 00:38
"why can't a CPU operate at 1 V and 1 A?" - they can! A CPU can even operate at a few volts and a few milliamps! But not this particular one. This CPU is designed for speed, not low power consumption. — user253751, Sep 21 '16 at 03:50
@AndrewMorton How to build a shunt ammeter widget (USA): as I recall it is 3 feet of #16 wire folded up (not coiled) and bridging the white side of a duplex outlet with the ear broken off. 12 milliohms. 1000 W reads 0.1 AC Volts. — , Sep 21 '16 at 15:52

alex.forencich · Accepted Answer · 2016-09-20T22:17:38.293

CPUs are not 'simple' by any stretch of the imagination. They have a few billion transistors, each of which has some small leakage at idle and has to charge and discharge the gate and interconnect capacitance of other transistors when switching. Yes, each one draws a small current, but when you multiply that by the number of transistors, you end up with a surprisingly large number. 64 A is already an average current; when switching, the transistors can draw a lot more than the average, and this is smoothed out by bypass capacitors. Remember that your 64 A figure came from working backwards from the TDP, so it is really a time-averaged figure, and there can be significant variation around it at many time scales (variation during a clock cycle, variation during different operations, variation between sleep states, etc.). Also, you might be able to get away with running a CPU designed to operate at 3 GHz on 1.2 volts and 64 amps at 1 volt and 1 amp... just maybe at 3 MHz. At that point, though, you have to worry about whether the chip uses dynamic logic that has a minimum clock frequency, so maybe you would have to run it at a few hundred MHz to a GHz and cycle it into deep sleep periodically to get the average current down. The bottom line is that power = performance. The performance of most modern CPUs is actually thermally limited.
This is relatively easy to calculate - \$I = C v \alpha f\$, where \$I\$ is the current, \$C\$ is the load capacitance, \$v\$ is the voltage, \$\alpha\$ is the activity factor, and \$f\$ is the switching frequency. I'll see if I can get ballpark numbers for a FinFET's gate capacitance and edit.
Sort of. The faster the gate capacitance is charged or discharged, the faster the transistor will switch. Charging faster requires either a smaller capacitance (determined by geometry) or a larger current (determined by interconnect resistance and supply voltage). Individual transistors switching faster then means they can switch more often, which results in more average current draw (proportional to clock frequency).
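To put rough numbers on that last point: if a gate capacitance \$C\$ is charged to a voltage \$v\$ by a constant current \$I\$, the charging time is \$t = C v / I\$. The sketch below is only illustrative. The constant-current assumption ignores the RC-shaped charging curve of a real driver, and the capacitance and drive currents are hypothetical values, not figures for any particular process.

```python
def charge_time_s(c_farads, v_volts, i_amps):
    """Time to deliver the charge Q = C * V at a constant current I."""
    return c_farads * v_volts / i_amps


C_GATE = 0.1e-15  # hypothetical gate capacitance: 0.1 fF
V_SWING = 1.25    # voltage swing in volts, taken from the question

# Hypothetical drive currents, chosen only to show the scaling.
for i_drive in (1e-6, 10e-6):
    t = charge_time_s(C_GATE, V_SWING, i_drive)
    print(f"{i_drive * 1e6:.0f} uA -> {t * 1e12:.1f} ps")
```

Under these assumptions, ten times the drive current gives one tenth of the charging time, which is the sense in which more current means faster switching.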

Edit: so, http://www.synopsys.com/community/universityprogram/documents/article-iitk/25nmtriplegatefinfetswithraisedsourcedrain.pdf has a figure for the gate capacitance of a 25nm FinFET. I'm just going to call it 0.1 fF for the sake of keeping things simple. Apparently it varies with bias voltage and it will certainly vary with transistor size (transistors are sized according to their purpose in the circuit, not all of the transistors will be the same size! Larger transistors are 'stronger' as they can switch more current, but they also have higher gate capacitance and require more current to drive).

Plugging in 1.25 volts, 0.1 fF, 3 GHz, and \$\alpha = 1\$, the result is \$0.375 \mu A\$. Multiply that by 1 billion and you get 375 A. That's the required average gate current (charge per second into the gate capacitance) to switch 1 billion of these transistors at 3 GHz. That doesn't count 'shoot through,' which will occur during switching in CMOS logic. It's also an average, so the instantaneous current could vary a lot - think of how the current draw asymptotically decreases as an RC circuit charges up. Bypass capacitors on the substrate, package, and circuit board will smooth out this variation. Obviously this is just a ballpark figure, but it seems to be the right order of magnitude. This also does not consider leakage current or charge stored in other parasitics (e.g. wiring).
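As a quick check of that arithmetic, here is a minimal Python sketch of the \$I = C v \alpha f\$ estimate. The function and variable names are made up for illustration, 0.1 fF is the rounded figure used above rather than a measured value, and leakage, shoot-through, and wiring capacitance are all ignored.

```python
def switching_current(c_farads, v_volts, alpha, f_hz):
    """Average current needed to charge C to V, alpha * f times per second."""
    return c_farads * v_volts * alpha * f_hz


C_GATE = 0.1e-15           # gate capacitance: 0.1 fF (rounded ballpark)
V_CORE = 1.25              # core voltage in volts
F_CLOCK = 3e9              # clock frequency: 3 GHz
N_TRANSISTORS = 1_000_000_000

per_transistor = switching_current(C_GATE, V_CORE, 1.0, F_CLOCK)
total = per_transistor * N_TRANSISTORS

print(f"per transistor: {per_transistor * 1e6:.3f} uA")
print(f"1 billion transistors: {total:.0f} A")

# Average activity factor that would match the 64 A in the question,
# under the same simplifying assumptions.
alpha_for_64a = 64 / total
print(f"alpha for 64 A: {alpha_for_64a:.2f}")
```

With these inputs, an average \$\alpha\$ of roughly 0.17 would account for the 64 A in the question, although that ignores everything the formula leaves out.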

In most devices, \$\alpha\$ will be much less than 1 as many of the transistors will be idle on each clock cycle. This will vary depending on the function of the transistors. For example, transistors in the clock distribution network will have \$\alpha = 1\$ as they switch twice on every clock cycle. For something like a binary counter, the LSB would have \$\alpha = 0.5\$ as it switches once per clock cycle, the next bit would have \$\alpha = 0.25\$ as it switches half as often, etc. However, for something like a cache memory, \$\alpha\$ could be very small. Take a 1 MB cache, for example. A 1 MB cache memory built with 6T SRAM cells has about 50 million transistors just to store the data. It will have more for the read and write logic, demultiplexers, etc. However, only a handful would ever switch on a given clock cycle. Let's say the cache line is 128 bytes, and a new line is written on every cycle. That's 1024 bits. Assuming the cell contents and the new data are both random, 512 bits are expected to be flipped. That's 3072 transistors out of about 50 million, or \$\alpha = 0.000061\$. Note that this is only for the memory array itself; the support circuitry (decoders, read/write logic, sense amps, etc.) will have a much larger \$\alpha\$. This is why cache memory power consumption is usually dominated by leakage current: that is a LOT of idle transistors just sitting around leaking instead of switching.
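The cache estimate can be reproduced with a few lines of Python. This is only a sketch of the back-of-the-envelope reasoning above: the function name is hypothetical, 1 MB is taken as \$2^{20}\$ bytes (which gives roughly 50 million storage transistors), and all six transistors of a flipped cell are counted as switching.

```python
def cache_array_alpha(cache_bytes, line_bytes,
                      transistors_per_cell=6, flip_probability=0.5):
    """Fraction of storage transistors that switch when one line is written."""
    total_transistors = cache_bytes * 8 * transistors_per_cell
    # With random old and new data, each written bit flips with probability 0.5.
    flipped_bits = line_bytes * 8 * flip_probability
    switching_transistors = flipped_bits * transistors_per_cell
    return switching_transistors / total_transistors


# 1 MB cache (assumed 2**20 bytes), one 128-byte line written per cycle.
alpha = cache_array_alpha(2**20, 128)
print(f"alpha = {alpha:.6f}")
```

Because the cell count cancels out, the result depends only on the ratio of line size to cache size: a larger cache with the same line size has an even smaller \$\alpha\$.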

1 V, 1 A isn't a weird target; ARM CPUs are quite commonly specced in mW/MHz. As a comparison, the whole Raspberry Pi A+ uses 1 W, including a 700 MHz CPU, which is a lot more than the meagre 3 MHz suggested — MSalters, Sep 20 '16 at 12:52
It's more useful to refer to "MIPS per watt", as the amount of work done per clock cycle varies wildly. — pjc50, Sep 20 '16 at 13:12
Well, it depends on what the chip is designed to do. A chip with a TDP of 80W that's designed to run at 3 GHz at 1.2 volts could maybe run on 1V and 1A...but at 1V you're going to have to drop the speed significantly, and to get it to draw 1A you'll have to drop the speed even more. You're not going to get anywhere near 3 GHz in that case. I have no idea what you would actually be able to achieve, though, as I haven't tried it myself. Maybe 3 MHz is a bit pessimistic for an i7 at 1V and 1A. Now, it's certainly possible to design a chip to run at that power level, as you mention. — alex.forencich, Sep 20 '16 at 13:44
They are not simple. In fact they are one of the most complex things we have ever built. — joojaa, Sep 20 '16 at 18:04
Modern Intel/AMD CPUs use at least some [dynamic logic](https://en.wikipedia.org/wiki/Dynamic_logic_(digital_electronics)) that would actually fail to work if clocked too *low*. Intel Skylake (for example) has a minimum efficient frequency/voltage point. To hit even lower power/throughput levels for SoC, it switches a core in and out of sleep at a variable duty cycle (>=800us at maybe ~1GHz (most efficient f), rest in sleep). **See [Efraim Rotem's IDF2015 Skylake power-mgmt talk, at about 53 minutes in](http://myeventagenda.com/sessions/0B9F4191-1C29-408A-8B61-65D7520025A8/7/5#sessionID=155)** — Peter Cordes, Sep 20 '16 at 20:01
Yeah, that's a good point, too. Modern CMOS is rather picky! — alex.forencich, Sep 20 '16 at 20:09
@alex.forencich One quick question, The decoupling capacitors that are recommended by the controller manufacturer. Are they designed for the highest recommended controller frequency? Since the speed of the controller could be varied, will this need to be considered? — seetharaman, Sep 22 '16 at 10:06
Bypassing has to account for the entire range of frequencies. Not just the frequency of operation directly, but the short pulses that get drawn even within clock cycles. Generally there will be multiple different sets of bypassing techniques used - bypassing on the die itself, bypassing on the package, bypassing on the board with planes, and bypassing on the board with caps. The mfr will be responsible for everything on the die and package and should have guidelines for what is required on the board. — alex.forencich, Sep 22 '16 at 13:23

score 17 · Answer 2 · answered Sep 20 '16 at 14:02

According to Wikipedia, top CPUs released in 2011 had some 0.5 to 2.5 billion transistors. Assuming a CPU with 1 billion transistors consumes 64 A, the average current is only 64 nA per transistor. Considering operating frequencies of several GHz, that's actually surprisingly little.
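For anyone who wants to check the division, here is a small Python sketch. It uses the 80 W and 1.25 V from the question and the 1 billion transistor assumption above. The 3 GHz clock is also taken from the question, and the charge-per-cycle figure is just an extra illustration of how small 64 nA is at that frequency.

```python
TDP_W = 80.0            # from the question
V_CORE = 1.25           # from the question
N_TRANSISTORS = 1e9     # assumed round number
F_CLOCK = 3e9           # assumed clock, 3 GHz as in the question
ELECTRON_CHARGE = 1.602176634e-19  # coulombs

total_current = TDP_W / V_CORE                   # amps
per_transistor = total_current / N_TRANSISTORS   # amps per transistor
charge_per_cycle = per_transistor / F_CLOCK      # coulombs per clock cycle
electrons_per_cycle = charge_per_cycle / ELECTRON_CHARGE

print(f"total: {total_current:.0f} A")
print(f"per transistor: {per_transistor * 1e9:.0f} nA")
print(f"electrons per transistor per cycle: {electrons_per_cycle:.0f}")
```

Under these assumptions each transistor moves, on average, only on the order of a hundred electrons per clock cycle.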

Dmitry Grigoryev

Does a higher CPU operating frequency require a higher current? – Lucenzo97 Sep 20 '16 at 19:26
Generally current \$I \approx I_0 + k f_C V\$, where \$f_C\$ is the clock frequency, \$V\$ is the operating voltage, \$I_0\$ is the leakage current, and \$k\$ is a constant (the corresponding dynamic power scales as \$k f_C V^2\$). \$k\$ will vary depending on how many transistors are switching at a given time as well as with the chip design. – Spehro Pefhany Sep 20 '16 at 19:28
At this point, we can put more transistors on a CPU than we can use at the same time without melting it. So at any given time, a large fraction of the chip is [Dark Silicon](https://en.wikipedia.org/wiki/Dark_silicon): not powered up, but sitting there waiting to be used while other parts of the chip (with different specialized functions) are powered down. e.g. the vector floating point hardware, the vector integer multipliers, and the vector shuffle units can't all be saturated at once, but they each have high throughput when used alone. Also, large caches don't switch much. – Peter Cordes Sep 20 '16 at 20:34
This is a big factor in CPUs gaining more and more specialized hardware, like AES and SHA crypto instructions, and Intel's BMI2 (especially [PEXT / PDEP bit-extract/deposit](https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets#Parallel_bit_deposit_and_extract)). Something to do with the transistor budget that can speed up some workloads but doesn't have to be powered on when not in use. – Peter Cordes Sep 20 '16 at 20:37
