From a more theoretical angle, frequency is the time derivative of phase; equivalently, phase is the time integral of frequency. So when a phase detector controls frequency via a VCO, there is an integration around the loop, which, roughly speaking, gives the loop a low-pass filtering effect on phase disturbances in the reference.
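To make the low-pass idea concrete, here is a minimal discrete-time sketch of a first-order loop. It is only an illustration under simplifying assumptions: phases are expressed as deviations from the nominal clock (so the VCO's free-running frequency is assumed to match the reference), the loop filter is a plain gain, and the name `pll_phase_response` and the value `loop_gain=0.05` are made up for the example.

```python
def pll_phase_response(ref_phase_dev, loop_gain=0.05):
    """Return the VCO phase deviation (radians) for each reference sample.

    ref_phase_dev: reference phase deviation from nominal, one value per sample.
    loop_gain: hypothetical phase-detector * VCO gain per sample (assumed value).
    """
    vco_phase_dev = 0.0
    out = []
    for ref in ref_phase_dev:
        phase_error = ref - vco_phase_dev          # phase detector
        freq_correction = loop_gain * phase_error  # control input sets VCO frequency
        vco_phase_dev += freq_correction           # VCO integrates frequency into phase
        out.append(vco_phase_dev)
    return out


# A single-sample, 1 rad phase glitch on an otherwise clean reference.
ref = [0.0] * 200
ref[50] = 1.0

out = pll_phase_response(ref)
peak = max(abs(x) for x in out)
print(f"glitch on reference: 1.0 rad, peak on VCO output: {peak:.3f} rad")
```

Because the VCO phase only moves by `loop_gain` times the error each sample, the recursion is a one-pole low-pass filter on phase: a short glitch is attenuated and then decays away, while a slow, sustained change in the reference is still followed. A lower `loop_gain` rejects more of the glitch at the cost of slower tracking.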
As supercat points out, the advantage gained is the rejection of "warbling" or even glitches in the reference.
Many years ago, with a freshly minted BEE, I used a PLL to solve a problem in a digital loop carrier: glitches on the backplane clock (caused, for example, by hot-plugging cards) made a particularly sensitive card "lock up", dropping any call in progress. The PLL rejected the glitches, producing a stable clock for the line card that was, on average, frequency-locked to the backplane clock.