Once a chip overheats it can start malfunctioning - for example many programs may start failing once some or all parts in a computer overheat.
What exactly happens that makes chips malfunction when they overheat?
To expand on other answers.
There are more reasons, but these make an important few.
The main problem with IC operation at high temperatures is the greatly increased leakage current of individual transistors. The leakage current can increase to such an extent that the switching voltage levels of the devices are affected, so that signals can't propagate properly within the chip and it stops functioning. Chips usually recover when allowed to cool down, but that is not always the case.
Manufacturing processes for high-temperature operation (up to 300 °C) employ silicon-on-insulator CMOS technology because of its low leakage over a very wide temperature range.
Just one addition to some excellent answers: technically it isn't the dopants that get more mobile; it is the intrinsic carrier concentration that increases. If anything, the carriers get less mobile, because the silicon crystal lattice starts to "vibrate" with the increased thermal energy, making it harder for electrons and holes to flow through the device. I believe physics calls this optical phonon scattering, but I may be wrong.
When the intrinsic carrier concentration increases beyond the doping level you lose electrical control of the device. Intrinsic carriers are the ones that are there before we dope the silicon; the idea of semiconductors is that we add our own carriers to create pn junctions and the other interesting things that transistors do. Silicon tops out at about 150 °C, so heat sinking RF and high-speed processors is very important, as 150 °C is not too difficult to reach in practice. There is a direct link between intrinsic carrier concentration and the off-state leakage current of a device.
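To get a feel for how steeply the intrinsic carrier concentration rises, here is a rough sketch using the common textbook scaling n_i(T) ∝ T^1.5 · exp(-Eg / 2kT). The constants are my assumptions, not values from this thread: a band gap fixed at 1.12 eV (its temperature dependence is ignored) and n_i ≈ 1e10 cm^-3 at 300 K. The function names and the 1e15 cm^-3 doping level in the usage line are illustrative only, and the sketch does not attempt to reproduce the 150 °C figure above.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K
E_GAP_EV = 1.12        # assumed: silicon band gap, treated as temperature-independent
N_I_300K = 1.0e10      # assumed: approximate intrinsic concentration at 300 K, cm^-3


def intrinsic_concentration(temp_k):
    """Approximate n_i(T) in cm^-3, scaled from the assumed 300 K value."""
    ratio = temp_k / 300.0
    exponent = -E_GAP_EV / (2.0 * K_B_EV) * (1.0 / temp_k - 1.0 / 300.0)
    return N_I_300K * ratio ** 1.5 * math.exp(exponent)


def crossover_temperature(doping_cm3, start_k=300.0, step_k=1.0, limit_k=1500.0):
    """Lowest temperature (K) on a 1 K grid where n_i reaches the doping level.

    Returns None if the level is not reached below limit_k.
    """
    temp = start_k
    while temp <= limit_k:
        if intrinsic_concentration(temp) >= doping_cm3:
            return temp
        temp += step_k
    return None


if __name__ == "__main__":
    for celsius in (25, 85, 125, 150, 200):
        n_i = intrinsic_concentration(celsius + 273.15)
        print(f"{celsius:4d} C  n_i ~ {n_i:.2e} cm^-3")
    # Hypothetical lightly doped region, 1e15 cm^-3
    print("crossover (K):", crossover_temperature(1.0e15))
```

The exponential term dominates, so n_i grows by orders of magnitude over a range where the doping level stays fixed. Lightly doped regions are therefore the first to be swamped by intrinsic carriers.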
As the other chaps have shown, this is just one of the reasons chips fail. It can even come down to something as simple as a wire bond getting too hot and popping off its pad; there's a huge list of things.
Although leakage currents increase, I would expect a bigger issue for many MOS-based devices is that the amount of current passed through a MOS transistor in the "on" state will decrease as the device gets hot. For a device to operate correctly, a transistor which is switching a node must be able to charge or discharge any latent capacitance in that part of the circuit before anything else relies upon that node having been switched. Reducing the current-passing ability of transistors will reduce the rate at which they can charge or discharge nodes. If a transistor is unable to charge or discharge a node sufficiently before another part of the circuit relies upon that node having been switched, the circuit will malfunction.
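The timing argument can be put into numbers with a very simplified model: a constant current I moving a node capacitance C through a voltage swing ΔV takes t = C·ΔV / I. The sketch below assumes the on-current falls like lattice-limited mobility, roughly T^-1.5. That is a textbook approximation and my assumption; real devices also see a threshold-voltage shift with temperature that partly offsets it. All names and the example values (100 fF, 1 V, 1 mA, 120 ps deadline) are hypothetical.

```python
def on_current(temp_k, i_on_300k=1.0e-3, mobility_exponent=1.5):
    """On-state current (A), assumed to scale like mobility ~ T^-1.5."""
    return i_on_300k * (temp_k / 300.0) ** (-mobility_exponent)


def switching_time(capacitance_f, voltage_swing_v, current_a):
    """Time (s) for a constant current to move a node by voltage_swing_v."""
    return capacitance_f * voltage_swing_v / current_a


def meets_timing(temp_k, capacitance_f, voltage_swing_v, deadline_s):
    """True if the node has switched before the next stage relies on it."""
    t = switching_time(capacitance_f, voltage_swing_v, on_current(temp_k))
    return t <= deadline_s


if __name__ == "__main__":
    # Hypothetical node: 100 fF, 1 V swing, must settle within 120 ps
    for temp_k in (300.0, 350.0, 400.0):
        t = switching_time(100e-15, 1.0, on_current(temp_k))
        ok = meets_timing(temp_k, 100e-15, 1.0, 120e-12)
        print(f"{temp_k:.0f} K: {t * 1e12:.0f} ps, meets timing: {ok}")
```

A node that has little timing margin when cool can miss its deadline when hot, even though nothing in the circuit is damaged.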
Note that for NMOS devices, there was a design trade-off when sizing passive pull-up transistors: the bigger a passive pull-up, the more quickly the node could switch from low to high, but the more power would be wasted whenever the node was low. Many such devices were therefore operated somewhat near the edge of correct operation, and heat-based malfunctions were (and for vintage electronics, remain) fairly common. For common CMOS electronics, such issues are generally less severe; I don't know how much of a part they play in practice in things like multi-GHz processors.
To complement existing answers, today's circuits are sensitive to two aging effects: hot-carrier injection (HCI) and negative-bias temperature instability (NBTI). These are not the only ones, but they are the main ones on processes below 150 nm.
Because temperature increases carrier mobility, it worsens HCI and NBTI effects, but temperature is not the primary cause of either.
These two silicon aging effects cause both reversible and irreversible damage to the transistors (by degrading the insulator layers), which increases the transistor threshold voltage (Vt). As a result, the part will require a higher voltage to maintain the same level of performance, which implies an increase in operating temperature and, as said in other posts, increased transistor gate leakage.
To summarise, temperature will not really make the part age faster; it is higher frequency and voltage (i.e. overclocking) that make a part age. But aging transistors require a higher operating voltage, which makes the part heat up more.
Corollary: the consequence of overclocking is an increase in temperature and required voltage.
The general reason ICs fail irreversibly is that the aluminium used inside them for the interconnects between the various elements melts and opens or shorts devices.
Yes, leakage currents will increase, but generally it's not the leakage current itself that is a problem, but the heat that this causes, and the consequent damage to the metal inside the IC.
Power circuits (e.g. power supplies, high-current drivers etc.) can get damaged because at high voltages, when the transistor drivers switch off quickly, internal currents are generated which cause latch-up of the device, or uneven power distribution inside it which causes local heating and subsequent metal failure.
A large number (thousands) of repeated thermal cycles can cause failure because of mismatches in thermal expansion between the IC and the package, eventually causing bond wires to be ripped off, or delamination of the plastic package material and subsequent mechanical failure.
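As a rough illustration of the expansion mismatch, the free (unconstrained) difference in strain between two bonded materials over a temperature swing ΔT is about |α₁ - α₂|·ΔT. The sketch below uses an approximate expansion coefficient for silicon and an illustrative figure that I assume for a plastic mold compound. Neither value comes from this thread, and real packages vary widely.

```python
CTE_SILICON = 2.6e-6         # per K, approximate value for silicon
CTE_MOLD_COMPOUND = 15.0e-6  # per K, assumed illustrative value for a plastic package


def mismatch_strain(delta_t_k, cte_a=CTE_SILICON, cte_b=CTE_MOLD_COMPOUND):
    """Free thermal strain difference between two materials for a swing delta_t_k."""
    return abs(cte_b - cte_a) * delta_t_k


def mismatch_length(length_m, delta_t_k, cte_a=CTE_SILICON, cte_b=CTE_MOLD_COMPOUND):
    """Difference in free expansion (m) over a span length_m."""
    return length_m * mismatch_strain(delta_t_k, cte_a, cte_b)


if __name__ == "__main__":
    # Hypothetical 10 mm span cycled through a 100 K swing
    strain = mismatch_strain(100.0)
    delta = mismatch_length(0.010, 100.0)
    print(f"strain difference: {strain:.2e}")
    print(f"length difference: {delta * 1e6:.1f} um")
```

The displacement per cycle is small, but it is imposed on bond wires and interfaces every time the part heats and cools, which is why the damage shows up only after many cycles.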
Of course, many IC parameters are only specified over a given temperature range and may be out of spec outside it. Depending on the design, this can cause failure or an unacceptable parametric shift while the IC is outside the temperature range; this can occur at extremely high or low temperatures.