Steve Yegge wrote a great blog post that, somewhat indirectly, addresses this.
Big point #1: compilers encompass pretty much every aspect of computer science. They're an upper-level course because you need to know all the other things you learn in the computer science curriculum just to get started. Data structures, searching and sorting, asymptotic performance, even graph coloring (that's how registers get allocated)? It's all in there.
There's a reason Knuth has been working on his monumental (and never-ending) "Art of Computer Programming" for several decades, even though it started out as (just) a compiler textbook. In the same way that Carl Sagan said "If you wish to make an apple pie from scratch, you must first invent the universe", if you wish to write a compiler, you must first deal with nearly every aspect of computer science.
That means that if a compiler is self-hosted, its language can almost certainly do whatever I need, no matter what I'm working on. Conversely, if a language's implementors never wrote a compiler in it, there's a good chance it's missing something that's really important to somebody, because they never had to write a program that forced them to confront all those issues.
Big point #2: from 30,000 feet, a surprising number of problems look just like compilers.
Compilers take a stream of symbols, figure out their structure according to some domain-specific predefined rules, and transform them into another symbol stream. Sounds pretty general, doesn't it? Well, yeah.
Whether you're on the Visual C++ team or not, you will very often find yourself needing to do something that looks just like part of a compiler. I do it literally every day.
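To make that concrete, here's a minimal sketch (in C, since it comes up below; the token names and rules are invented for illustration) of that general shape: read one symbol stream, group it by a few predefined rules, emit another stream.

```c
#include <ctype.h>
#include <stdio.h>

/* A minimal "symbol stream in, symbol stream out" transformer:
 * reads characters from stdin, groups them by a few predefined
 * rules, and writes a tagged token stream to stdout. The tags
 * (NUMBER, IDENT, PUNCT) are invented for illustration. */
int main(void)
{
    int c = getchar();
    while (c != EOF) {
        if (isspace(c)) {
            c = getchar();                  /* rule: drop whitespace */
        } else if (isdigit(c)) {
            printf("NUMBER(");              /* rule: digits clump into numbers */
            do { putchar(c); c = getchar(); } while (c != EOF && isdigit(c));
            printf(") ");
        } else if (isalpha(c) || c == '_') {
            printf("IDENT(");               /* rule: letters clump into identifiers */
            do { putchar(c); c = getchar(); } while (c != EOF && (isalnum(c) || c == '_'));
            printf(") ");
        } else {
            printf("PUNCT(%c) ", c);        /* rule: everything else passes through */
            c = getchar();
        }
    }
    putchar('\n');
    return 0;
}
```

Feed it `x1 = 42;` and it prints `IDENT(x1) PUNCT(=) NUMBER(42) PUNCT(;)`. Every "compiler-shaped" everyday task (lexers, log rewriters, config parsers) starts exactly like this; only the rules and the output format differ.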
Unlike members of most other professions, programmers don't just use tools; they build their own. A programmer who can't write tools (whether from lack of skill or from lack of usable tools to build them with) will forever be handicapped, limited to whatever tools somebody else provides.
If a language is "not well-suited to creating" programs that take a stream of symbols, apply rules to it, and transform it into another stream of symbols, that sounds like a pretty limited language, and not one that would be useful to me.
(Fortunately, I don't think there are many programming languages that are ill-suited to transforming symbols. C is probably among the worst such languages in use today, yet C compilers are usually self-hosted, so that never stopped anyone.)
I'll end with a third reason, from personal experience, that Yegge doesn't mention (he wasn't writing about why to self-host): self-hosting shakes out bugs. When a compiler compiles itself, every time you build it (not just every time you run it) you depend on it to work, and to work correctly, against a decent-sized codebase: the compiler itself. (This is why GCC's build does a multi-stage bootstrap and compares the stage-2 and stage-3 binaries: any difference means the compiler miscompiled itself.)
This month I've been using a relatively new and famous non-self-hosted compiler (you can probably guess which one), and I can't go two days without segfaulting the thing. It makes me wonder how much its designers ever actually had to use it.