All the answers provide insight into what a stack is and how multiple stacks are used, but not why having several of them is useful.
Typical "big" CPUs found in desktop computers use several stacks, usually one per privilege level. Combined with the memory management unit (MMU), this lets the OS, which runs at a higher privilege level, protect itself from misbehaving user processes. Having several stacks simplifies memory management with the MMU, because the OS has its own stack in its own dedicated memory region (on multi-core systems the OS uses multiple stacks, but the situation is more or less the same). Microcontrollers, on the other hand, don't have an MMU, only possibly an MPU (memory protection unit), and the protection it provides is usually very coarse. They also don't usually run full-blown OSes with several processes, but rather a small RTOS with several threads and limited protection. Can such an OS work without using a dedicated stack?
Of course it can. In fact, I wrote a simple preemptive RTOS for PIC18 microcontrollers in which the OS doesn't use a dedicated region of memory for its stack, and it worked well. The problem with doing so is that every thread's stack has to be big enough to hold not only what the thread needs for its own computation, but also what interrupts and the OS need on top of that, which means that all threads must use bigger stacks than would otherwise be needed. In my case that wasn't too much of a problem, and not switching the stack back and forth saved a few cycles, but it was a memory overhead nonetheless.
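To put rough numbers on that overhead, here is a minimal sketch in C of the two budgets. The function names and the figures in the usage note are made up for illustration; they are not taken from my PIC18 RTOS, and the model deliberately ignores details such as alignment or any context a particular core saves on the interrupted stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total stack RAM when there is no dedicated OS stack,
 * so every thread stack must also absorb the worst-case interrupt/OS usage. */
size_t shared_stack_total(size_t thread_count, size_t thread_bytes,
                          size_t irq_os_bytes)
{
    assert(thread_count > 0);
    return thread_count * (thread_bytes + irq_os_bytes);
}

/* Hypothetical helper: total stack RAM when interrupts and the OS run on
 * one dedicated stack, so the extra room is reserved only once. */
size_t dedicated_stack_total(size_t thread_count, size_t thread_bytes,
                             size_t irq_os_bytes)
{
    assert(thread_count > 0);
    return thread_count * thread_bytes + irq_os_bytes;
}
```

With 8 threads that each need 128 bytes for their own work and 64 bytes of worst-case interrupt/OS usage, the shared layout costs 8 × (128 + 64) = 1536 bytes, while the dedicated layout costs 8 × 128 + 64 = 1088 bytes. The difference grows with the number of threads, since the extra room is paid once instead of once per thread.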
Cortex-M cores, with their MSP and PSP stack pointer registers, make stack switching seamless and automatic. With the main stack (the one MSP points to, used for the OS and interrupts) placed properly, you can avoid having one thread corrupted by an interrupt that occurs while a different thread is running and overruns that thread's stack.
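The rule behind that automatic switching is small enough to model on a host machine. The sketch below is only a model, not code meant to run on the target: `active_stack`, `active_stack_t` and `check_stack_selection` are names I made up, and the only hardware fact it relies on is that bit 1 of the CONTROL register (SPSEL) selects the stack pointer in Thread mode, while Handler mode always uses MSP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CONTROL bit 1 (SPSEL): 0 = Thread mode uses MSP, 1 = Thread mode uses PSP. */
#define CONTROL_SPSEL (1u << 1)

/* Illustrative type, not part of any vendor header. */
typedef enum { STACK_MSP, STACK_PSP } active_stack_t;

/* Model of which stack pointer the core uses for the code it is running. */
active_stack_t active_stack(bool handler_mode, uint32_t control)
{
    if (handler_mode) {
        return STACK_MSP;            /* exception handlers always run on MSP */
    }
    return (control & CONTROL_SPSEL) ? STACK_PSP : STACK_MSP;
}

/* Usage sketch: the three cases an RTOS port cares about. */
void check_stack_selection(void)
{
    /* After reset: Thread mode, SPSEL clear, everything runs on MSP. */
    assert(active_stack(false, 0u) == STACK_MSP);
    /* RTOS thread: Thread mode with SPSEL set, so the thread runs on PSP. */
    assert(active_stack(false, CONTROL_SPSEL) == STACK_PSP);
    /* Interrupt during that thread: the handler body runs on MSP. */
    assert(active_stack(true, CONTROL_SPSEL) == STACK_MSP);
}
```

In a typical setup the RTOS sets SPSEL once when it starts the first thread and then only rewrites PSP on each context switch; the core handles the switch to MSP on exception entry and back on return by itself.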