The exponential behavior is known as subthreshold conduction and is seen when the MOSFET's channel is in weak inversion. In this mode of operation, diffusion current dominates (while in strong inversion, the device operation is best explained with drift current).
From an analog design standpoint, operating in this weak inversion regime allows for highly power-efficient design. In particular, \$g_m / I_D\$ (one particular figure of merit used in some designs) is quite good, as is the output impedance of the MOSFET. This allows for high gains to be realized with low current consumption.
With wide transistors operating in this weak inversion regime, it is also possible to achieve good gain-bandwidth products under the assumption that output capacitive loading contributes the dominant pole (since the gain-bandwidth product is then roughly \$G_m / (2\pi C_L)\$). This assumption is often valid when doing VLSI design.
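As a rough illustration of the load-dominated case, here is a minimal Python sketch of a single-pole gain stage. The numbers (\$g_m/I_D\$ of 25 per volt, 1 µA of bias, 10 MΩ of output resistance, 1 pF of load) are made-up placeholders rather than values from any real process, and the function names are my own.

```python
import math

def single_pole_stage(gm_over_id, i_d, r_out, c_load):
    """First-order model of a gain stage whose dominant pole is at the output.

    gm_over_id : transconductance efficiency in 1/V (assumed value)
    i_d        : bias current in A (assumed value)
    r_out      : output resistance in ohms (assumed value)
    c_load     : load capacitance in F (assumed value)
    """
    gm = gm_over_id * i_d                              # transconductance, S
    dc_gain = gm * r_out                               # low-frequency voltage gain
    f_pole = 1.0 / (2.0 * math.pi * r_out * c_load)    # output pole, Hz
    gbw = dc_gain * f_pole                             # gain-bandwidth product, Hz
    return gm, dc_gain, f_pole, gbw

def gbw_from_gm(gm, c_load):
    """Gain-bandwidth product written directly as gm / (2*pi*C_L)."""
    return gm / (2.0 * math.pi * c_load)

# Illustrative numbers only: 1 uA of bias current driving 1 pF.
gm, dc_gain, f_pole, gbw = single_pole_stage(25.0, 1e-6, 10e6, 1e-12)
print(f"gm = {gm:.3e} S, A0 = {dc_gain:.0f}, f_pole = {f_pole:.3e} Hz, GBW = {gbw:.3e} Hz")
```

The output resistance cancels out of the product, which is why the gain-bandwidth product depends only on the transconductance and the load, and why a high \$g_m/I_D\$ translates directly into bandwidth per unit of current.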
The tradeoff is the speed of the transistor itself, as \$f_t\$ is relatively poor in this mode of operation. Once the input signal reaches a frequency of \$f_t\$, the transistor can no longer perform any meaningful amplification: to get 1 nA of sinusoidal small-signal current through the drain, the gate capacitance must be driven with that same amount of current to actually modulate the channel. This creates issues for higher-frequency signals, since the high transconductance achievable with wide transistors begins to interact with the capacitances inherent to the transistor itself.
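To put the unity-current-gain argument into numbers, here is a small Python sketch using the common first-order estimate \$f_t \approx g_m / (2\pi C_{gg})\$, where \$C_{gg}\$ is a lumped total gate capacitance. That lumped model, the function names, and the example values are assumptions for illustration only.

```python
import math

def transit_frequency(gm, c_gg):
    """First-order f_t estimate: frequency at which the current gain falls to 1."""
    return gm / (2.0 * math.pi * c_gg)

def current_gain(freq, gm, c_gg):
    """Magnitude of small-signal drain current over gate current.

    The gate is treated as a pure capacitor c_gg, so the gate current is
    2*pi*f*c_gg*v_gs and the drain current is gm*v_gs.
    """
    return gm / (2.0 * math.pi * freq * c_gg)

# Illustrative numbers only: a wide device with 25 uS of gm and 100 fF of gate capacitance.
gm = 25e-6
c_gg = 100e-15
f_t = transit_frequency(gm, c_gg)
print(f"f_t = {f_t:.3e} Hz")
for f in (f_t / 100, f_t / 10, f_t):
    print(f"f = {f:.3e} Hz -> current gain = {current_gain(f, gm, c_gg):.1f}")
```

Widening the transistor at a fixed bias current increases \$C_{gg}\$ while \$g_m\$ saturates in weak inversion, so in this simple model \$f_t\$ keeps falling as the device is pushed deeper into weak inversion.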
While I don't have exact numbers from any particular CMOS process which I am allowed to share, I can show this approximate chart that adequately conveys the rough behavior of these two figures of merit as the inversion coefficient changes.

Notice that as the transistor's channel is driven into weaker and weaker inversion, the transconductance efficiency tops out at some value, while the transit frequency drops further and further, into values as low as the kHz range in extreme situations.
In the processes that I worked with (mostly older mixed-signal CMOS, sub-micron but not anywhere near FinFET sizes), the typical values for \$g_m / I_D\$ (transconductance efficiency) are in the 20s to 30s (in units of \$\text{V}^{-1}\$) in weak inversion. This comes from the fact that the weak inversion drain current in a MOSFET is like the collector current of a BJT, but with a non-ideality factor \$n\$:
$$\begin{align}
I_{bjt} &= I_0 e^{\frac{V_\pi}{V_t}}\\
I_{fet} &= I_0 e^{\frac{V_\pi}{n V_t}} \,\,\text{(for weak inversion only)}
\end{align}$$
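Here \$V_t\$ is the thermal voltage \$kT/q\$ (about 26 mV at room temperature), not the threshold voltage. Differentiating each expression with respect to \$V_\pi\$ gives \$g_m = I_{bjt}/V_t\$ and \$g_m = I_{fet}/(nV_t)\$, so the transconductance efficiency in each case is \$\frac{1}{V_t}\$ and \$\frac{1}{nV_t}\$, respectively.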
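As a quick sanity check on those two results, here is a minimal Python sketch that differentiates the exponential model numerically and compares the outcome with \$\frac{1}{nV_t}\$. The values of \$I_0\$, \$n\$, and the bias voltage are arbitrary placeholders, not numbers from a real process.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C

def thermal_voltage(temp_k=300.0):
    """Thermal voltage kT/q in volts."""
    return K_B * temp_k / Q_E

def exp_current(v, i0, n, temp_k=300.0):
    """I = I0 * exp(v / (n * Vt)); n = 1 gives the BJT-like case."""
    return i0 * math.exp(v / (n * thermal_voltage(temp_k)))

def gm_over_id_numeric(v, i0, n, temp_k=300.0, dv=1e-6):
    """gm/I from a central-difference derivative of the exponential model."""
    i_hi = exp_current(v + dv, i0, n, temp_k)
    i_lo = exp_current(v - dv, i0, n, temp_k)
    gm = (i_hi - i_lo) / (2.0 * dv)
    return gm / exp_current(v, i0, n, temp_k)

def gm_over_id_analytic(n, temp_k=300.0):
    """Closed-form result 1 / (n * Vt)."""
    return 1.0 / (n * thermal_voltage(temp_k))

# Placeholder values: I0 = 1 pA, bias of 0.3 V, and a few plausible n values.
for n in (1.0, 1.3, 1.5):
    numeric = gm_over_id_numeric(0.3, 1e-12, n)
    analytic = gm_over_id_analytic(n)
    print(f"n = {n}: numeric = {numeric:.2f} 1/V, analytic = {analytic:.2f} 1/V")
```

At 300 K this gives a ceiling of a little under 39 per volt for \$n = 1\$, and values in the mid-to-high 20s for \$n\$ between 1.3 and 1.5, which lines up with the 20s to 30s range mentioned above.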
At the other end of the range, \$f_t\$ can be over 10 GHz even in an older (larger than 100 nm) process.