It's all due to inductance:
Say your microcontroller draws supply current which ramps up from 1mA to 11mA in 5ns then back to 1mA every time it processes an instruction.
di/dt = 10mA/5ns = 2 000 000 A/s
Now, the voltage across an inductor is v = L di/dt and the trace from the power supply to the microcontroller has, let's say, 50nH of inductance...
v = L di/dt = 50nH × 2 000 000 A/s = 100mV drop on the supply.
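The arithmetic above can be reproduced with a few lines of Python. This is only a sketch using the example numbers from this answer; the function name `inductive_drop` is made up for illustration, and the 50nH trace inductance is the same guess as above, not a measured value.

```python
def inductive_drop(inductance_h, delta_i_a, delta_t_s):
    """Voltage across an inductor for a linear current ramp: v = L * di/dt."""
    di_dt = delta_i_a / delta_t_s      # current slew rate in A/s
    return inductance_h * di_dt        # volts

# Example numbers from the text: 1mA -> 11mA in 5ns through ~50nH of trace.
delta_i = 11e-3 - 1e-3   # current step in amperes
rise_time = 5e-9         # ramp duration in seconds
trace_l = 50e-9          # assumed trace inductance in henries

di_dt = delta_i / rise_time
v_drop = inductive_drop(trace_l, delta_i, rise_time)

print(f"di/dt = {di_dt:.3g} A/s")
print(f"supply drop = {v_drop * 1e3:.0f} mV")
```

Doubling either the current step or the trace inductance, or halving the rise time, doubles the drop.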
OK, it doesn't crash yet, because it's a slow micro that doesn't draw much current. But a faster micro, or any other chip drawing faster or larger current spikes, needs its power to come from a low-inductance source to avoid voltage sag when it draws current pulses, and a capacitor placed close to its supply pins is a good way to achieve that.
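To get a feel for why a nearby capacitor helps, here is a rough estimate of how much an ideal capacitor sags if it supplies the whole current pulse by itself. The assumptions are loud ones: the pulse is treated as a triangle with 10mA of extra current at the peak, the 5ns fall time is a guess (only the rise time is given above), and the capacitor's own ESR and inductance are ignored. `cap_droop` is a made-up helper name.

```python
def cap_droop(charge_c, capacitance_f):
    """Voltage lost by an ideal capacitor that gives up charge_c coulombs: dV = dQ / C."""
    return charge_c / capacitance_f

peak_extra_current = 10e-3   # amperes above the 1mA baseline
rise_time = 5e-9             # seconds, from the example
fall_time = 5e-9             # seconds, ASSUMED (not stated in the example)

# Charge in a triangular pulse = area of the triangle = 1/2 * base * height.
charge = 0.5 * peak_extra_current * (rise_time + fall_time)

for capacitance in (100e-9, 1e-6):
    droop = cap_droop(charge, capacitance)
    print(f"C = {capacitance * 1e9:.0f} nF -> droop = {droop * 1e6:.0f} uV")
```

Under these assumptions the sag comes out well under a millivolt for either value, compared with the 100mV lost across the trace inductance when there is no local capacitor.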
Just as important is the fact that the capacitor keeps the noisy current drawn by your micro in a small local loop.
A current loop acts as a loop antenna, and for a small loop the radiated field grows with the loop area, so the amount of radiated noise will be much lower when the capacitor is close.
Also, if you have other components on the same supply, say an opamp, then the capacitor at the micro will keep the micro's noise from screwing up the opamp's supply, which tends to cause some garbage at the output...
So there you have it, the caps take care of:
- power integrity: caps serve high di/dt supply current locally
- EMI: reduce loop antenna area
- EMC: keep the noise out of the other sensitive devices
Now, how to choose the value:
- A roll of 100x 25V 0805 X7R costs €1.40 for 100nF and €5.40 for 1µF. So, buy a roll of 100 of 1µF.
- Every time you have to put a decoupling capacitor on your circuit, remember: if you spend 10 minutes reading the datasheet and discover that 100nF will work, you just lost 10 minutes to save 4 cents, if you only build one unit...
- I just put in 1µF, guaranteed to work every time. Also it has less ringing, works better with lowish-ESR electrolytics, etc...
- Also I use 25V caps so I only have to stock one part for supplies from 3.3V to 15V...
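The cost argument is simple arithmetic, sketched below with the roll prices quoted above. The hourly rate is purely a hypothetical number picked for illustration, so plug in your own.

```python
ROLL_SIZE = 100            # capacitors per roll
price_100nf_roll = 1.40    # euros, from the prices quoted above
price_1uf_roll = 5.40      # euros, from the prices quoted above

# Extra cost of using 1uF instead of 100nF, per capacitor.
extra_per_cap = (price_1uf_roll - price_100nf_roll) / ROLL_SIZE

hourly_rate = 30.0         # euros per hour, HYPOTHETICAL value for illustration
minutes_spent = 10         # time spent checking the datasheet
time_cost = hourly_rate * minutes_spent / 60

# Number of capacitors you would have to place before the saving pays for the time.
break_even_caps = time_cost / extra_per_cap

print(f"extra cost per cap: {extra_per_cap * 100:.1f} cents")
print(f"10 minutes cost:    {time_cost:.2f} euros")
print(f"break-even after about {break_even_caps:.0f} capacitors")
```

For a one-off board the 1µF default wins easily; for a large production run the sum can go the other way.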