8

I've always been under the impression that overclocking any sort of CPU (for a PC or a microcontroller) is a bad thing. You're essentially operating the unit outside the manufacturer's specifications, which means the device may or may not work as intended. It may go faster, but at the risk of erratic behavior.

I understand that the decision to overclock or not is a philosophical one, depending on how much risk you're willing to take on with your devices. But:

What kind of permanent damage can be caused by overclocking a CPU?

(Note: I ask this because a number of my friends are gamerz who think overclocking roxors soxors... and for some odd reason, after they do this, their computers break with bluescreens and then I get called. I want some ammunition to use so that I don't have to troubleshoot potentially flaky hardware so often...)

J. Polfer
  • 3,780
  • 2
  • 27
  • 33

3 Answers

12

I've overclocked almost every computer (excluding laptops) I've ever owned, purely for the cost savings and so my MATLAB sims don't take all day.

Overclocking, in the sense of raising the clock speed or multiplier, shouldn't damage modern CPUs. The thermal shutdown in the CPU should trigger early enough to prevent damage. Older CPUs didn't have such robust thermal protection.

If you're raising various voltages in an attempt to run even faster, you can inadvertently cause permanent damage to the CPU. It's good to stay within the max voltage specifications given by the CPU manufacturer.

Depending on your usage model, overclocking can reduce lifespan. This is really just a function of CPU temperature: the hotter it runs, the shorter its life. If the CPU is running right at the edge of its TDP rating 24/7, I wouldn't expect it to last ten years.
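As a rough illustration of that temperature/lifespan relationship, here's a small Python sketch using an assumed Arrhenius-style acceleration model. The 0.7 eV activation energy is a commonly cited ballpark for silicon wear-out mechanisms, not a figure for any specific CPU; treat the output as order-of-magnitude only.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def acceleration_factor(t_use_c, t_stress_c, ea_ev=0.7):
    """Arrhenius ratio of failure rates at t_stress_c vs t_use_c (Celsius).

    Assumed model for illustration; real CPU reliability data is
    mechanism-specific and not published per part.
    """
    t_use = t_use_c + 273.15     # convert to Kelvin
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / K_BOLTZMANN_EV) * (1.0 / t_use - 1.0 / t_stress))

# A die held at 85 C instead of 60 C ages roughly 5-6x faster
# under this assumed model:
print(round(acceleration_factor(60, 85), 1))
```

The point isn't the exact number; it's that failure rates rise exponentially with temperature, which is why running near the thermal limit 24/7 eats into the design lifetime.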

You generally are not running the device outside its design specifications as long as you stay within the specified voltage levels. As a design matures, manufacturing yields get better and better, and parts binned at 2.6 GHz are very often capable of (and tested for) much higher speeds; they are binned to the low end simply to meet the higher market demand for that segment.

I'm currently typing on a Core i7 920 at 4.1 GHz with air cooling (granted, it's one massive heatsink and two 140 mm fans). It's the D0 stepping, a newer stepping capable of much higher speeds than the older steppings. I actually ran a 12-hour Prime95 test at 4.25 GHz, but anything higher started spewing errors, and I didn't want to raise the supply voltages any more, so I backed off to 4.1 GHz for some headroom. You also have to make allowances for ambient temperature changes if your space isn't air conditioned.

EDIT for sheepsimulator:

The effect on the RAM depends on the architecture you're talking about and the features offered by the motherboard.

For example, the Core i7 architecture has one base clock (BCLK) that generates the clocks for the CPU core, the 'uncore', the QPI link, and the RAM via four different multipliers.

In some CPU models these multipliers have limited ranges, but the key to your question is this: when you overclock the system, you normally crank up the base clock, which also increases the RAM clock. However, you can reduce the RAM clock multiplier to get stock, or very close to stock, RAM speeds if you wish. The Core i7 920 uses DDR3-1066 RAM by default, but DDR3-1600 is almost the same price, so most people buy the faster RAM and adjust the RAM multiplier up to the 1600 rating. Good motherboards also give you control over the RAM voltage, so you have the option of overvolting/overclocking the RAM should you so wish.

In some older architectures there was limited or no control over the RAM clock multiplier, which could mean you needed faster RAM to achieve a particular CPU clock.
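The clock relationships above can be sketched numerically. The multiplier values here are illustrative, chosen to roughly match a stock i7 920; exact multiplier options vary by part and motherboard:

```python
# Sketch of Nehalem-era (Core i7) clock derivation: one base clock
# (BCLK) feeds the core and the memory through separate multipliers.
# Values are illustrative, not exact part specifications.

def effective_clocks(bclk_mhz, core_mult, ram_mult):
    """Derive core and DDR3 effective clocks from BCLK and multipliers."""
    return {
        "core_mhz": bclk_mhz * core_mult,
        "ram_mt_s": bclk_mhz * ram_mult,  # DDR3 effective rate (MT/s)
    }

# Roughly stock i7 920: 133 MHz BCLK, 20x core, 8x RAM
# -> ~2660 MHz core, ~DDR3-1066
print(effective_clocks(133, 20, 8))

# BCLK cranked to 200 MHz with the RAM multiplier dropped to 6x
# to keep the RAM reasonably close to its rating
print(effective_clocks(200, 20, 6))
```

This is why blindly raising BCLK without touching the RAM multiplier can push the memory far past its rated speed: every multiplied clock scales with it.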

Mark
  • 3,672
  • 22
  • 18
  • @Mark - doesn't overclocking also affect your RAM with certain parameter changes? I had a friend that overclocked his i5, and his slot0 on the motherboard became damaged and caused memtest errors. He changed the BCLOCK. – J. Polfer Aug 23 '10 at 13:37
  • @sheepsimulator - nice name :) and check my edit. In short, he could have wildly overclocked his ram if he just blindly cranked up BCLK without lowering the memory multiplier. RAM generally doesn't have much, if any, thermal protection so you have to pay attention to your memory clocks. – Mark Aug 23 '10 at 20:13
  • running your CPU at increased temperature will reduce the expected life of the CPU. Just as a side note, thought I would add it. I still find that in general my computer will be outdated before my CPU fails, so not a major risk. – Kortuk Jan 13 '11 at 22:45
  • Can anyone add more about "overclocking can cause reduced life span". Like empirical evidence? Statistics? – David Balažic Nov 24 '20 at 15:02
3

Mainly it's a thermal issue. Electromigration, driven by excessive current, can also permanently break the chip.

Nick T
  • 12,360
  • 2
  • 44
  • 71
Brian Carlton
  • 13,252
  • 5
  • 43
  • 64
2

This reminds me of a great little article entitled The Zen of Overclocking by Bob Colwell who was the chief IA-32 architect for the Intel Pentium Pro to Pentium 4 processors.

Unfortunately the document is not available to the general public, but it should be available to IEEE Computer Society members and on many/most university networks. It was originally published in Computer magazine, March 2004 (Vol. 37, No. 3), pp. 9-12.

A couple of brief quotes:


Abstract: Overclocking is a large, uncontrolled experiment in better-than-worst-case system operation.

... This issue of Computer [magazine issue] spotlights what I call "better-than-worst-case" design. With normal worst-case design, any computing system is a conglomeration of components, operating within frequencies, power supply voltages, and temperature ranges that were set to simultaneously accommodate worst-case values of every single component. (Modern CPUs don't really do it quite this way anymore, but they once did, and it's easiest to think of worst-case design this way.) ...

...Compare the seat-of-the-pants, maybe-it-will-work approach of the overclockers to the engineering challenge confronting Intel and AMD. First, note that this challenge isn't just the flip side of the overclocker's coin. Chip manufacturers must design and produce tens or hundreds of millions of chips; overclockers only worry about one. Manufacturers must set a quantifiable reliability goal, and no, it's not "zero failures, ever." That would be an unreachable—and not very productive—target because hitting it would require avoiding cosmic rays. Even at sea level, that would require more meters of concrete than any laptop buyer is going to find attractive. And even then, the concrete would only improve the odds. It would remain a statistical game. ...

Conclusion

If you don't floss your teeth, they won't necessarily rot away. The vast majority of car trips do not include any metal bending, so why wear seat belts? And why not smoke? Not all smokers get cancer. Or you could adopt Oscar London's compromise, "If you smoke, why bother wearing a seat belt?" And some rock musicians from the 1960s are still alive, so maybe all those drugs are really beneficial, acting as some kind of preservative. As for me, well, I'm an engineer, and I live in a statistical world. I'm going with the odds.


As to the specifics of whether overclocking can cause permanent damage: yes. In particular, as lithography improves at creating smaller-scale dies (e.g. 35 nanometre), the thickness of the insulator/gate oxide decreases as well. This ever-thinner barrier can fail due to high voltage or gradual deterioration, so the margin for error keeps shrinking.

I believe MOSFET transistors are still the basis of CPU design, so looking at some of the difficulties of MOSFET size reduction may highlight other potential issues that overclocking can cause. At the system level, overclocking may also cause internal/cross-channel EMI/RFI within the CPU die or any of the other subsystems (e.g. the RAM bus), and may reduce the signal-to-noise ratio (SNR) to the point that mechanical noise or external EMI/RFI is no longer tolerable, producing random errors on the digital buses.

And for the record, I have damaged processors through stupid overclocking and poor thermal dissipation. So beyond the theory, it is actually possible.

mctylr
  • 1,570
  • 9
  • 12
  • There are really 2 versions of overclocking: the first is running the device faster than it was ever designed to run; the second is running a part that was binned at a lower speed grade than its family can run. The former certainly has much more danger involved; the latter, especially in later steppings of the die, is really quite safe. I get the impression the linked article was discussing primarily the former. – Mark Jan 14 '11 at 02:46
  • @Mark, my (incomplete) understanding is that binning (small die, graded for different speeds) is based on statistical analysis of yields, not purely market economics (cost vs. supply). You would need to compare sunk (NRE) costs to material costs per unit to give you a clue to whether binning was being used to maximize profit. – mctylr Jan 14 '11 at 20:14
  • ...same die, graded for different speeds... – mctylr Jan 14 '11 at 20:41
  • To clarify perhaps: the failure rate for an individual die is not the same across the surface of a wafer; I believe in most cases it is lowest at the center and increases for dies at the outer edge of the wafer. So binning of seemingly identical die cores is done based on probability of failure, which can be mitigated through speed reductions or potentially disabling a subset of features (i.e. if the L2 cache or FPU is a common spot for failure due to density), a binned version can have the feature disabled, reducing the chance of failure for these outermost dies. – mctylr Jan 16 '11 at 02:08