(Source: embedded developer.)
First and foremost, you need a language that compiles to machine instructions, not to some intermediate instructions (as Python and Java do). If your compiler builds to Java bytecode, what runs the bytecode?
The compiled-to-machine-code requirement eliminates many, many high-level languages.
Second, your language needs to compile to the machine instructions of multiple platforms. OK, so Haskell can compile to x86 and ARM. How about the other dozen CPUs out there?
C's linker gives me a crazy amount of power. I can put code in a specific location. Say my CPU starts up at 0x08000. I can tell the linker to store code at that location. I can tell the linker to store certain code in certain segments. The loader will put those segments in a specific location. That's useful if I have code I want to run out of flash vs code to run out of RAM. The Linux kernel puts startup code into a chunk of memory that is later freed, thus recovering memory from code that's only run once, ever.
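Here's a rough sketch of what that section placement looks like from the C side. It assumes GCC or Clang building an ELF binary and linking with GNU ld or lld. The section name `init_once` and the function `board_init` are made up for illustration. On a real board a linker script (not shown) would map that section to flash, RAM, or a region that gets freed after boot. The `__start_init_once` and `__stop_init_once` symbols are ones GNU ld generates on its own for sections whose names are valid C identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name; a linker script would decide where it lands. */
#define INIT_ONCE __attribute__((section("init_once"), noinline, used))

/* Run-once setup code, tagged so the linker groups it in "init_once". */
INIT_ONCE int board_init(int clock_mhz)
{
    return clock_mhz * 1000; /* placeholder work: MHz to kHz */
}

/* Section bounds, provided by the linker (GNU ld / lld behavior). */
extern const char __start_init_once[];
extern const char __stop_init_once[];

/* Size of the run-once region, i.e. what could be reclaimed after boot. */
size_t init_once_bytes(void)
{
    return (size_t)((uintptr_t)__stop_init_once - (uintptr_t)__start_init_once);
}
```

On a hosted Linux build the default linker script just drops the section next to the rest of the code, but `init_once_bytes()` still reports a nonzero size, which shows the code really did land in its own section.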
I can call assembly from C and C from assembly. CPU startup is all in assembly. Low-low level stuff is done in assembly because the C environment might not be initialized yet! (Something has to call main().) The very low level multi-CPU synchronization constructs (semaphores, mutexes, etc.) are usually in assembly because they require specific CPU instructions. Caches, MMUs, etc., are usually configured in fiddly bits of assembly. (ARM is loaded with specific instructions to configure the cache and MMU.)
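To make the "specific CPU instructions" point concrete, here's a minimal sketch of a test-and-set lock that drops into assembly from C. It assumes GCC or Clang (the extended `__asm__` syntax and the `__atomic` builtins are compiler extensions, not standard C). The names `test_and_set`, `spin_trylock`, and `spin_unlock` are mine, not from any particular kernel. Only x86 gets hand-written assembly here; other targets fall back to the compiler builtin, which emits whatever instruction that CPU needs.

```c
#include <stdint.h>

/* Atomically store 1 into *lock and return the value that was there. */
static inline uint32_t test_and_set(volatile uint32_t *lock)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t old = 1;
    /* xchg with a memory operand is implicitly locked on x86. */
    __asm__ volatile("xchgl %0, %1"
                     : "+r"(old), "+m"(*lock)
                     :
                     : "memory");
    return old;
#else
    /* Other CPUs: let the compiler pick the right instruction sequence. */
    return __atomic_exchange_n(lock, 1u, __ATOMIC_ACQUIRE);
#endif
}

/* Returns 1 if the lock was free and is now held by the caller, else 0. */
int spin_trylock(volatile uint32_t *lock)
{
    return test_and_set(lock) == 0;
}

/* Release the lock so another CPU can take it. */
void spin_unlock(volatile uint32_t *lock)
{
    __atomic_store_n(lock, 0u, __ATOMIC_RELEASE);
}
```

The callers only ever see ordinary C functions. The one instruction that has to be exactly right stays in assembly, and everything around it stays in C.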
C is a thin layer above assembly. C is faster to write than assembly. So C wins for low-level stuff. C wins because no one else has come up with a competitor.