
In The 5GHz Project, an article on overclocking a Pentium from 1 GHz to 5 GHz using liquid nitrogen, there is an assertion that "Heat dissipation rises exponentially during extreme overclocking". However, in this post on CPU power and heat: How are the CPU power and temperature caculated/estimated?, it appears that power is linearly proportional to frequency, as follows:

To work out the work done whenever the gate changes state you can model it as a capacitor with some effective capacitance, \$C_g\$, and you get:

$$W = \frac{1}{2}C_gV^2$$

and the power is the work per state change times the number of state changes per second, so:

$$P_g \propto C_gV^2f$$

If you add up all the logic gates in the processor you can define an effective total capacitance, \$C\$, that will be the sum of all the gate capacitances, \$C_g\$, so:

$$P \propto CV^2f$$
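
As a quick numerical illustration of \$P \propto CV^2f\$ at constant voltage, here is a short Python sketch; the capacitance and voltage values below are invented, illustrative numbers, not measurements of any real CPU:

```python
# Toy illustration of the dynamic-power model P = C * V^2 * f.
# All constants are hypothetical, chosen only to show the scaling.

def dynamic_power(c_total, v_dd, f):
    """Dynamic power of a switching capacitive load (simplified model)."""
    return c_total * v_dd**2 * f

C = 20e-9   # 20 nF effective switched capacitance (assumed)
V = 1.4     # supply voltage in volts (assumed)

p_1ghz = dynamic_power(C, V, 1e9)
p_5ghz = dynamic_power(C, V, 5e9)

# At constant voltage, power scales linearly with frequency:
print(p_5ghz / p_1ghz)  # 5.0
```

This is the linear-in-frequency behavior the simple model predicts; the disagreement with "exponential" only appears once voltage has to change too.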

The 5GHz project page then states "In the past we recorded about 135 watts using the Chip-con compressor at 4.1 GHz. Using our nitrogen cooling to break the 5 GHz sound barrier would produce peak heat dissipation of up to 180 watts emitted from a die surface area of 1.12 square centimeters. Applied to our example that means 1.6 MW per square meter."

I asked the above question in the context of the question How are the CPU power and temperature caculated/estimated?. My follow-on question was deleted, but before it was, I got this reply:

The formula only suggests a linear increase with respect to frequency if the voltage is held constant, which is not going to be true for large overclocks such as this one. The CPU becomes unstable at high frequencies, which can be partially compensated for by increasing the voltage. Thus, for large overclocks, the power grows faster than linearly because of this accompanying voltage contribution. I don't think it's literally exponential, however; that may be just imprecise language. The details would probably be better left to another SE site.

Probably the 5GHz people meant to say "increases quadratically", assuming that as they increase frequency, voltage is also increased in proportion to the level necessary for the chip to function reliably. What is the equation relating frequency and required voltage level for a chip which operates stably at frequency F and voltage V? Note that for a GPU such as the AMD HD7850, I can set the clock frequency without changing the voltage supplied to the chip, so it is not automatically the case that changing the frequency implies a change in supplied chip voltage. Clearly at some point my chip will stop functioning properly, so it would be helpful to have an equation showing how much to increase the voltage as a function of frequency. Also note that some GPU users undervolt the chip; why would they do this?
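
One commonly cited approximation for this relationship is the alpha-power law (Sakurai-Newton), under which the maximum stable frequency scales roughly as \$f_{max} \propto (V_{dd}-V_{th})^\alpha / V_{dd}\$. The sketch below inverts that relation numerically to estimate the voltage a given frequency target would need; \$\alpha\$, \$V_{th}\$, and every voltage here are assumed, illustrative values, not data for any particular chip:

```python
# Sketch of the alpha-power law: f_max ∝ (Vdd - Vth)^alpha / Vdd.
# ALPHA, VTH and all voltages are assumed, illustrative values.

ALPHA = 1.3   # velocity-saturation exponent (assumed)
VTH = 0.35    # threshold voltage in volts (assumed)

def f_max(v_dd):
    """Relative maximum stable frequency at supply voltage v_dd."""
    return (v_dd - VTH) ** ALPHA / v_dd

def v_required(f_target):
    """Bisect for the supply voltage giving f_max == f_target
    (valid because f_max is monotonically increasing for alpha >= 1)."""
    v_lo, v_hi = VTH + 1e-6, 5.0
    for _ in range(200):
        v_mid = (v_lo + v_hi) / 2
        if f_max(v_mid) < f_target:
            v_lo = v_mid
        else:
            v_hi = v_mid
    return v_hi

# Example: if the chip is stable at Vdd = 1.0 V, how much voltage does a
# 30% frequency increase need under this model?
v_new = v_required(1.3 * f_max(1.0))
print(round(v_new, 2))  # 1.35 with these assumed constants
```

In this toy parameterization a 30% frequency increase needs roughly a 35% voltage increase, which is why power grows much faster than linearly once overvolting is required.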

Under the assumption of quadratic heat rise in frequency/voltage, it would be more efficient to have 5 processors running at 1 GHz than 1 processor running at 5 GHz, is this correct? I.e. space is quadratically cheaper than time, is that more or less correct? How would one correctly state this tradeoff?
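
If voltage really must scale in proportion to frequency, the model \$P \propto CV^2f\$ actually gives cubic, not quadratic, growth. A back-of-the-envelope sketch of the 5-slow-chips-vs-1-fast-chip comparison, assuming perfect parallel speedup and strictly proportional voltage scaling (both strong assumptions):

```python
# Back-of-the-envelope: P ∝ V^2 f with V ∝ f gives P ∝ f^3.
# Assumes perfect parallelism and strictly proportional voltage scaling.

def relative_power(f_rel):
    """Power relative to a 1x-frequency chip, assuming V scales with f."""
    v_rel = f_rel          # assumed: voltage proportional to frequency
    return v_rel**2 * f_rel

one_fast  = relative_power(5.0)       # one chip at 5x frequency
five_slow = 5 * relative_power(1.0)   # five chips at 1x frequency

print(one_fast / five_slow)  # 25.0 -> same throughput, 25x the power
```

Under these assumptions the parallel configuration wins by a factor of 25 in power for the same throughput; real workloads rarely parallelize perfectly, which is the usual counterweight.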

In this response to a question on How Modern Overclocking Works, @Turbo J says

You can increase the clock frequency further when the voltage is higher - but at the price of massive additional generated heat. And the silicon will "wear out" faster, as bad things like Electromigration will increase too.

So again, the question is: what is the equation that models this "massive additional heat"? Is "massive" quadratic, i.e. are we still talking about the work equation above?

user29254
  • As you posted, it is a linear relationship to clock frequency in a simplified model. It has a stronger dependence on the voltage. People overuse 'exponentially'. – HL-SDK Sep 05 '13 at 18:29
  • "That is, if I am driving an oscillating signal twice as fast, if the voltage is the same, I expect each peak voltage to be half as much, is that right?" No it is not right. – cksa361 Sep 05 '13 at 21:33
  • OK cksa361, please help me understand how much I need to raise voltage on an overclocked GPGPU in order to maintain signal levels. According to another poster, I needed to raise levels as a function of frequency increase. – user29254 Sep 05 '13 at 21:39
  • Regarding your comment: Back in the days we performed overclocking by trial and error. 1.Increase frequency until component becomes unstable. 2.Increase voltage a tiny bit and check if stability improved. If yes, repeat step 1, if no repeat step 2. Stop as soon as temperature becomes a problem or system is too unstable. – Rev Sep 05 '13 at 22:03
  • For most microcontroller families, you find a voltage vs frequency graph in the data sheet. But I guess you are talking about complex high-end CPUs/GPUs, where heat and power consumption is highly dependent on the instructions that are executed. It's probably just not predictable enough to give a reliable voltage-frequency relationship. – Rev Sep 05 '13 at 22:10
  • I am playing with GPGPU for litecoin mining. AMD Radeon GPGPUs come with control panels now so you can pick your own GPU and memory clock frequencies, undervolt or overvolt, and manually set fan speeds. It's part of the fun. I have an MSI Radeon 7850 which heats up to 95C at 100% load. I have an ASUS Radeon 7850 which is much better at staying cool at 100% load. I bought and then returned a liquid cooling card fan replacement after another guy told me that liquid cooling can leak onto motherboard, and claiming that correct air cooling should be adequate. – user29254 Sep 05 '13 at 22:23
  • I have noticed that Litecoiners will also undervolt GPU in hopes of reducing power consumption when they overclock. Net net is I started looking at extreme cooling and found the above article which made the erroneous claim that heat dissipation is exponential in overclocking. Quadratic is maybe correct if overvolting is necessary to maintain signal quality, which leads to question of just how much overvolting. Undervolting should fail in general. Litecoining is tolerant of computational faults because output can be checked and discarded by separate CPU. – user29254 Sep 05 '13 at 22:26
  • As you mention, undervolting is generally for people looking to reduce their power consumption. For miners, this translates to reduced costs; for other enthusiasts, this generally means an ability to use quieter cooling. Back in the day, I ran my Opteron undervolted *and* overclocked. – mng Sep 05 '13 at 22:31
  • Which means that the digital signal has a ways to sink before 0/1 is no longer recognized, but still needs to be pulled up (overvolted) for more extreme overclocking. So there is a curve there, just searching for the relationship. Surprisingly, there are no obvious references for this which is why I am going to StackExchange. It appears that the miners are doing some leading edge hardware research by playing with these parameters. – user29254 Sep 05 '13 at 23:13
  • Oh wait @mng there is an obvious reference: [Dynamic voltage scaling](https://en.wikipedia.org/wiki/Dynamic_voltage_scaling) and [Overclocking](https://en.wikipedia.org/wiki/Overclocking) – user29254 Sep 05 '13 at 23:20
  • Due to differences in process technology, process variation, transistor type/size for a given process, and design specifics (pipelining), there is no hard-and-fast relationship for frequency vs voltage. Overclockers develop a "feel" for every device stepping as they hit the market. – mng Sep 06 '13 at 00:13
  • Yes @mng but...for "extreme" overclocking of the 5GHz project (of 2003; 5GHz doesn't seem so extreme today), there was what they called an "exponential" increase in heat dissipation, which we are thinking is really quadratic due to overvolting, and someone claimed you had to overvolt when the frequency is high enough, so it seems like there is some physics there which is not as fragile from a modelling point of view as you suggest, i.e. up to some constants there should be a relatively simple equation giving P(f,V(f),Cg). – user29254 Sep 06 '13 at 00:41
  • What @mng has told you is true. It's not that the physics are "fragile" but that for modern processors the relationship is more complex. Leakage is another important source of power consumption that is not proportional to clock frequency, and only the manufacturer knows the relative contribution. Miners aren't doing "leading edge research", they are just collecting anecdotal data. – Joe Hass Sep 06 '13 at 01:19
  • Hi @Joe Hass, in my work I am pitched by Intel and NVidia to use their GPGPUs in finance (option pricing; generally not that well suited). I accidentally found in crypto currency mining a problem which is well suited to GPGPU, whose practitioners purchase vastly more GPGPU hardware than Wall St, and where the "consumer" hardware is lapping "pro" hardware (e.g. Radeon vs Xeon Phi) which has a 10x higher price tag. Bitcoin miners have even moved on to ASIC and FPGA. So while they are not doing "research" in the usual sense, they are pragmatically ahead of anybody but NSA running qbits. – user29254 Sep 06 '13 at 01:37
  • Don't get too hung up on their use of the word 'exponential'. That article is full of hyperbole. – mng Sep 06 '13 at 04:05
  • Hi @mng, I have an MSI Radeon HD7850 card that is poor at cooling (goes to 95C under 100% load quickly), an Asus Radeon HD7850 that cools much better (same conditions stays at 70C). I almost got an Arctic liquid cooler but was warned that liquid coolers leak and need fluid replacement every 6 months. I see racks like this: [AMD](http://arstechnica.com/security/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/) and [Xeon](http://www.colfax-intl.com/ms/xeonphi/images/CXP8000-int.gif): Tradeoff between overclock+cooling vs adding more units running slower is interesting. – user29254 Sep 06 '13 at 14:53

1 Answer


In general:

Take any physical system to an extreme, and all the simple models developed by engineers will break down.

Simple model for active power dissipation:

The statement about an exponential increase in heat dissipation at extreme overclocking is not consistent with the following equation:

$$P_g \propto C_gV^2f$$

But how was the above equation derived?

Well, it is based on the following simplification:

(schematic: a CMOS inverter driving a single equivalent output capacitor; created using CircuitLab)

This model assumes that:

  • Transistors behave like ideal, mutually exclusive switches (no overlap in time when both switches are ON)
  • All capacitances may be represented as a single equivalent capacitor at the output
  • No leakage currents
  • No inductances
  • More assumptions

Under the above assumptions, you can think of the inverter's (or any other logic gate's) action as charging the output capacitor to \$V_{dd}\$ (which draws \$C_{tot}V_{dd}^2\$ joules from the power supply, half of which is stored on the capacitor and half dissipated), and then discharging it to ground (which dissipates the stored energy but draws nothing further from the supply). The frequency factor \$f\$ represents the number of such cycles per second.

In fact, it is surprising that the above equation can be an accurate estimate of dynamic power at all, given the large number of non-trivial assumptions made. Indeed, this result should be used for first-order analysis only - any serious discussion of power dissipation in modern CPUs can't rely on such a simplified model.

How the simple model breaks:

All the assumptions made while developing the above simplified model break down at some point. However, the most delicate assumption, and the one which can't hold at extreme frequencies, is that of two mutually exclusive ideal switches.

A real inverter has a non-ideal Voltage Transfer Curve (VTC) - the relation between the inverter's input and output voltages:

(figure: inverter VTC with the NMOS and PMOS operating regions marked)

On the above VTC the operating regions of both the NMOS and the PMOS are marked. We can see that during switching there is a time when both the NMOS and the PMOS conduct simultaneously. This means that not all the current drawn from the power supply flows into the "output capacitor" - part of it flows directly to ground, thus increasing the power consumption:

(figure: short-circuit current path through both transistors during switching)

What this has to do with frequency:

When the frequency is relatively low, the switching time of the inverter makes up a negligible part of the total operating time:

(figure: waveforms at a relatively low frequency - switching intervals are a small fraction of each period)

However, when the frequency is pushed to the limit, the inverter "switches continuously" - it is almost always in the middle of a transition, thus dissipating a lot of power through the direct path to ground (note the changed time scale):

(figure: waveforms near the frequency limit - the inverter is switching almost continuously)
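
This "switches continuously" regime can be quantified with a one-line estimate: the fraction of each clock period spent in transition. The per-transition time below is an invented illustrative value, not a measured device parameter:

```python
# Fraction of a clock period spent in the switching (short-circuit) region,
# for an assumed per-transition time of 50 ps (illustrative value only).

T_SW = 50e-12  # seconds per output transition (assumed)

def switching_fraction(f):
    """Share of each period during which both devices may conduct."""
    return min(2 * T_SW * f, 1.0)  # two transitions per clock period

for f in (1e9, 5e9, 10e9):
    print(f"{f/1e9:.0f} GHz: {switching_fraction(f):.0%}")
# 1 GHz: 10%
# 5 GHz: 50%
# 10 GHz: 100%
```

Once this fraction approaches 1, the mutually-exclusive-switches assumption is gone entirely, which is exactly where the simple linear model stops applying.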

It may be possible to model this analytically and check whether the result is exponential, but I prefer to use simulation (note, though, that the simulation accounts for all non-idealities, not just this one).

Simulation results:

In simulation I measured the total energy (the integral of power) drawn from an ideal power supply by an inverter in the following configuration:

(figure: simulated chain of three inverters; energy is measured on the middle one)

The first and the last inverters are there just to model realistic driving and loading conditions.

The dissipated energy as a function of frequency:

(plot: dissipated energy versus clock period)

We can see an approximately linear dependence for periods longer than 1 ns, and a clearly exponential dependence for shorter periods.

Notes:

  1. For the simulation I used antique 0.25 µm transistor models. Current state-of-the-art transistors are more than 10x shorter - I'd guess the divergence from the linear model is even stronger in newer technologies.
  2. Whether a particular CPU/GPU can be overclocked into this exponential frequency-dependence region while remaining stable and functional is device specific. In fact, this is exactly what overclockers try to determine empirically - how far a given device can be pushed without malfunctioning.
  3. None of the above results and discussion considers changing voltage levels. I guess there is no way to analytically predict the outcome of changing both frequency and voltage simultaneously - the only way to find out is to perform an experiment.

From a single inverter to CPU:

CPUs mainly consist of logic gates, which are conceptually similar to an inverter. However, each modern CPU has sophisticated means of controlling its operating frequency and operating voltage, and can turn off its submodules at runtime. This means that the heat dissipation trend of the whole processor may differ somewhat from that of a single inverter. I guess the statement about an exponential increase in heat dissipation during extreme overclocking is a bit of an exaggeration, but we are not mathematicians: whether it is exponential or \$\propto f^{3+}\$, it is all equally "bad".

Vasiliy
  • Hi @Vasiliy Zukanov, can you quantify the tradeoff between 5 CPUs running at 1GHz vs 1GPU running at 5GHZ, in Watts/cycle? Do 5 air cooled slow processors make more economic sense than 1 liquid nitrogen cooled fast processor? This is non-obvious in the sense that there is a market for liquid cooled machines (all Crays in their day) and liquid cooled cards, e.g. a new liquid-cooled double Radeon HD 7950 just came out: [ARES2-6GD5](http://www.asus.com/ROG_ROG/ARES26GD5/). Liquid cooling is used to push towards the dicey end of the curve – user29254 Sep 07 '13 at 14:48
  • Also here is a lab report on [ARES2-6GD5](http://www.tomshardware.com/reviews/rog-ares-ii-dual-gpu-review,3458.html). – user29254 Sep 07 '13 at 15:07
  • @user29254, too many factors involved: the architecture of CPUs and GPU, their microarchitecture, the fabrication technologies of both, SW architecture, the overhead of operating system, many more... I don't think anyone can give you an accurate comparison - even if you'll get both manufacturers sitting in the same room. The only thing is that GPUs usually have HW modules for special calculations (which may take many cycles in CPU) - try to see whether the algorithm you want to run will benefit from GPUs HW. If not, I'd say go for 5 CPUs. – Vasiliy Sep 07 '13 at 15:27
  • Hi @Vasiliy Zukanov, the question is really about rack configurations and water cooling for GPGPU mining and password cracking. For example this rack uses air cooling and [MOSIX OS from Israel](http://www.mosix.org/txt_vcl.html/) and the rack designer says that water cooling is a waste of time: [AMD Cluster](http://arstechnica.com/security/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/) – user29254 Sep 07 '13 at 16:47
  • Also, the lab report on ARES2 says at peak load card reaches 625W which exceeds supplied 575W. My conclusion from all this is that bundling 2 GPUs into one watercooled card is a waste of time vs just plugging 2 cards into a properly powered motherboard. – user29254 Sep 07 '13 at 16:54
  • @user29254, I've never been a big fan of fancy stuff like overclocking and experimental configurations - I'm too lazy. The title of the question states that the question is about frequency dependence of heat dissipation. I suggest you'll ask another question with appropriate title in order to draw attention of people with the required knowledge. – Vasiliy Sep 07 '13 at 18:31
  • I really appreciate your succinct, deep and clear answer to the original question. It completely satisfies my curiousity on this topic. I was just trying to give you some color on what the wider practical motivation for the question is. I was trying to dig deep on whether it was optimal or pessimal to overclock. Variables are space and wattage devoted to cooling versus less space but lower performance for air cooled. My sense is that the deep dive tells me that overclocking and water cooling are pessimal. I can add separate question for the bigger topic. – user29254 Sep 07 '13 at 19:15
  • How would things affected if e.g. VDD were reduced to 1.0? An inverter whose input was below 0.1 volts should be able to pull the output above 0.95, and one whose input was at 0.9 should be able to pull the output below 0.05, but no voltage level should turn on both high- and low-side drivers simultaneously. – supercat Jan 10 '14 at 22:21
  • I could certainly see that metastability issues would take on increased significance in such a design (normally a latch will be constructed so the feedback loop has greater-than-unity gain just about everywhere, but without high-side/low-side overlap the loop gain would be zero near the mid-rail) but if a circuit only had to deal with well-conditioned inputs I would think eliminating the shoot-through could improve efficiency. – supercat Jan 10 '14 at 22:24