Any time you care deeply about performance, you generally want to get as close to the metal as you can. In most languages, you can write performance-critical segments in C, and C programmers can drop down to assembly language for the really critical stuff. So if I'm writing some C# code but really need tight performance in an inner loop, I can write that loop in C or C++ and use interop to call it. If I need even more performance, I can write assembly in my C library. Going lower than assembly is possible, but who wants to write machine code these days?
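To make the interop idea concrete, here is a minimal sketch of a high-level language calling into compiled C. It uses Python's `ctypes` rather than C# interop, purely for illustration, and it assumes a POSIX system where the C library can be loaded this way. `strlen` stands in for whatever hot routine you would actually write in C, and the wrapper name `c_strlen` is made up for this example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library may return None, in which case
# CDLL(None) exposes the symbols already loaded into the current process,
# which includes the C library on POSIX platforms.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: call the C library's strlen via interop."""
    return libc.strlen(data)
```

Every call across the language boundary has its own overhead, so this approach tends to help most when each call does a substantial amount of work on the C side.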
However, and this is the big consideration, dropping close to the metal only pays off for small, tightly scoped problems. If I were writing a 3D renderer, I might do the floating-point math and rendering in C (using a library to execute it on the video card). But performance problems are also architectural, and performance issues that come from large-scale design are often better solved in a high-level language.
Look at Erlang: Ericsson needed a language that made massively parallel work easy, because parallel processing was going to get them far more performance than any tightly optimized C routine running on one CPU core. Likewise, having the fastest code running in your loop only helps if you can't remove the loop entirely by doing something better at the high level.
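As a deliberately tiny illustration of removing the loop entirely, consider summing the integers from 1 to n. The function names below are made up for this sketch. You could hand-tune the loop in C or assembly, but the closed-form version does constant work however large n gets, so no amount of low-level tuning of the loop will catch it.

```python
def sum_to_n_loop(n: int) -> int:
    """Add 1 + 2 + ... + n one term at a time: work grows with n."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_n_closed_form(n: int) -> int:
    """Same result from the formula n * (n + 1) / 2: no loop at all."""
    return n * (n + 1) // 2
```

Real cases are rarely this clean, but the shape is the same: a better algorithm or data structure at the high level can make the hot loop disappear instead of merely making it faster.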
You can do huge-system, high-level programming in C, but sometimes the greater expressiveness of a more powerful language will reveal opportunities for architectural optimizations that wouldn't be obvious otherwise.