It's not the definition of phase margin. It's the way that "gain" in a control loop is defined. There are two systems of thought jostling together in the world of control theory, and neither one has gained preeminence over the other. So you have to pay attention to who's talking, and you have to figure out which system of thought is in use (usually from the context; you don't always just get told).
Control loops are often defined as having a summing junction which inverts the feedback signal. I.e., the error signal contains a term which is the negative of the output signal, or the negative of some signal derived from the output:
$$e(s) = r(s) - F(s) y(s).$$
Here's a typical block diagram (from Wikimedia Commons):
This causes all sorts of confusion.
You can define the loop gain as the gain all the way around the loop: you break the loop at any point (e.g., between the \$C(s)\$ and the \$G(s)\$ blocks), feed a signal into the following block (\$G(s)\$ in this example), and take the output of the preceding block (\$C(s)\$ in this example).
In that case, if there is some frequency at which the loop gain is exactly one, the system will oscillate*. In this system of thought, the loop gain is
$$L(s) = C(s) G(s) P(s) \left( -F(s) \right), \tag 1$$
where the minus sign on \$F(s)\$ is because of the inverting character of the summing junction (note the minus sign on the feedback in the picture).
So here, the critical point you're looking for (and avoiding) is \$L(s) = 1\$. In terms of phasors, \$1 = 1 \angle 360^\circ\$: this is where the loop angle of \$360^\circ\$ comes from.
Or, you can define the loop itself as being from the output of the summing junction (\$e(s)\$ in the picture), to the input of the summing junction (the '-' input in the picture). In this case, the "loop gain" is defined as
$$L(s) = C(s) G(s) P(s) F(s). \tag 2$$
In this system, the loop will oscillate if the "loop gain" is exactly minus one -- i.e., the critical point is \$L(s) = -1\$. In terms of phasors, \$-1 = 1 \angle 180^\circ\$: this is where the "loop angle" of \$180^\circ\$ comes from.
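To see that the two critical points describe the same condition, it helps to write out the closed-loop transfer function. This is only a sketch: it assumes the forward path from \$e(s)\$ to \$y(s)\$ is the cascade \$C(s) G(s) P(s)\$, which is how the products in (1) and (2) read, together with the error equation above.

$$y(s) = C(s) G(s) P(s)\, e(s), \qquad e(s) = r(s) - F(s)\, y(s)$$

$$\frac{y(s)}{r(s)} = \frac{C(s) G(s) P(s)}{1 + C(s) G(s) P(s) F(s)}$$

With the definition in (2), the denominator is \$1 + L(s)\$, which vanishes when \$L(s) = -1\$. With the definition in (1), \$L(s)\$ carries the opposite sign, so the same denominator is \$1 - L(s)\$, which vanishes when \$L(s) = 1\$. Same denominator, same root; only the bookkeeping of the minus sign differs.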
Both are in use. Each is consistent, in its own way.
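As a numerical illustration, here is a minimal Python sketch that measures the phase margin under both conventions. The loop \$C(s)G(s)P(s)F(s) = 1/(s(s+1))\$ is a made-up example (it's not taken from the diagram), and the function names are hypothetical; only the standard library is used.

```python
import cmath
import math


def forward_path(s):
    """Hypothetical product C(s)G(s)P(s)F(s) = 1 / (s (s + 1)); illustration only."""
    return 1.0 / (s * (s + 1.0))


def loop_gain_around_loop(s):
    """Convention of Eq. (1): includes the summing junction's inversion."""
    return -forward_path(s)


def loop_gain_excluding_junction(s):
    """Convention of Eq. (2): excludes the summing junction's inversion."""
    return forward_path(s)


def gain_crossover(loop_gain, lo=1e-3, hi=1e3, iterations=200):
    """Bisect on a log scale for the frequency (rad/s) where |L(jw)| = 1.

    Assumes |L(jw)| falls monotonically through 1 between lo and hi,
    which holds for the example loop above but not for every loop.
    """
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if abs(loop_gain(1j * mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def phase_margin_deg(loop_gain, critical_point):
    """Phase of L(jw_c) relative to the critical point, in degrees.

    This is the extra phase lag that would put L(jw_c) exactly on the
    critical point.
    """
    w_c = gain_crossover(loop_gain)
    return math.degrees(cmath.phase(loop_gain(1j * w_c) / critical_point))


pm_eq1 = phase_margin_deg(loop_gain_around_loop, critical_point=1.0)
pm_eq2 = phase_margin_deg(loop_gain_excluding_junction, critical_point=-1.0)
print(f"Eq. (1) convention, critical point +1: {pm_eq1:.2f} deg")
print(f"Eq. (2) convention, critical point -1: {pm_eq2:.2f} deg")
```

Both lines should report the same margin, roughly 51.8 degrees for this example loop. The phase you'd read off a Bode plot differs by 180 degrees between the two conventions, but so does the critical point you measure against, so the margin comes out the same.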
From a strictly theoretical standpoint, the formulation that takes the gain around the entire loop, including the summing junction (i.e., \$L(s) = 1\$ means oscillation), is probably the sounder of the two, and it definitely doesn't leave you scratching your head in one of those unusual cases where the summing junction doesn't have a minus sign.
From a practical, working-engineer standpoint, the formulation that defines loop gain excluding the summing junction (i.e., \$L(s) = -1\$ means oscillation) seems to be more popular with folks who actually design working control loops, and who slap their calculations in front of you for review.
You can argue endlessly about what's "right", but you won't get anywhere. I gave up on that long ago; I use the one that's most convenient for the problem at hand, and any time I'm looking at someone's Bode plot, I pay attention to which convention they're using.
* This is the Barkhausen criterion.