I hear of circuits using a "carrier frequency" and modulating it and it
seems as if the speed is dependent on this frequency (for example
ethernet is several hundred MHz). What I mean is why can't one skip
all the modulation business and have a circuit where they simply flip
the voltage high/low super super fast and achieve speeds of 1 THz (= 1
Tb/sec)?
The quote above is a comment from the OP but I think it gets closer to the misunderstanding than what he puts in the question.
Pure data transmissions (1s and 0s) don't make efficient use of bandwidth. That's not a problem in itself - the simplicity of transmitting 1s and 0s makes reception of those bits very easy. With 5V logic signalling (0V and 5V), noise and glitches can come along and make the signal somewhat "different" from the original 0V and 5V, but provided the receiver can still distinguish between the two levels, the data can be faithfully recovered.
The trouble with normal (unmodulated) data is that it can't share a line with another normal data system. The two lots of 1s and 0s end up being additive, so sometimes you'll get 0V, sometimes 5V and sometimes 10V, i.e. rubbish.
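To make the "rubbish" concrete, here is a minimal sketch in Python. It assumes an idealised wire on which the two 5V drivers simply add; the function name `line_voltage` and the bit patterns are made up for illustration.

```python
# Two independent 5 V logic streams driving the same wire.
# Assumption: an idealised wire where the two voltages simply add.
HIGH = 5  # volts, logic 1

def line_voltage(bits_a, bits_b):
    """Voltage seen on the shared wire during each bit period."""
    return [HIGH * a + HIGH * b for a, b in zip(bits_a, bits_b)]

stream_a = [1, 0, 1, 0]
stream_b = [1, 1, 0, 0]

shared = line_voltage(stream_a, stream_b)
print(shared)  # [10, 5, 5, 0]
```

The second and third bit periods both read 5V, yet they come from different senders, so a receiver has no way of telling which stream sent the 1.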
When data (say 1kbps) is applied to a carrier wave, the result has a bandwidth of a few thousand hertz centred on the carrier frequency, which might be 20MHz. Below (say) 19.97MHz and above 20.03MHz the side frequencies can be largely removed and not transmitted AND, importantly, the "modulated" carrier can still be received and turned back into the original data.
You could have another data system whose carrier is at 19.9MHz - if modulated with different data (still 1kbps for this example), the useful bandwidth it might occupy is from 19.87MHz to 19.93MHz. A receiver tuned to this carrier won't suffer interference from the transmission at 20MHz.
You could repeat this system and have a multitude of exclusive data systems all sharing the same wire (different carriers of course) and all the data streams are perfectly recoverable by their own receivers.
This is why modulation schemes of one sort or another are used. This one is called frequency division multiplexing. It can use a wire or radio.
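As a rough illustration of frequency division multiplexing, the sketch below puts two bit streams on one simulated wire and recovers each one separately. Everything numeric here is an assumption chosen to keep the simulation small: the carriers are scaled down to 1kHz and 2kHz rather than 19.9MHz and 20MHz, the bit rate is 100bps, and the modulation is plain on-off keying. The correlation step stands in for a tuned receiver; a real one would use filters and a proper demodulator.

```python
import math

SAMPLE_RATE = 8000                          # samples per second (assumed)
BIT_RATE = 100                              # bits per second (assumed)
SAMPLES_PER_BIT = SAMPLE_RATE // BIT_RATE   # 80 samples in each bit period
CARRIER_A = 1000                            # Hz, scaled-down stand-in carrier
CARRIER_B = 2000                            # Hz, scaled-down stand-in carrier

def modulate(bits, carrier_hz):
    """On-off keying: carrier present for a 1, absent for a 0."""
    samples = []
    for i, bit in enumerate(bits):
        for n in range(SAMPLES_PER_BIT):
            t = (i * SAMPLES_PER_BIT + n) / SAMPLE_RATE
            samples.append(bit * math.sin(2 * math.pi * carrier_hz * t))
    return samples

def demodulate(line, carrier_hz):
    """Correlate each bit period with the wanted carrier, then threshold."""
    bits = []
    for start in range(0, len(line), SAMPLES_PER_BIT):
        i_sum = q_sum = 0.0
        for n in range(start, start + SAMPLES_PER_BIT):
            t = n / SAMPLE_RATE
            i_sum += line[n] * math.sin(2 * math.pi * carrier_hz * t)
            q_sum += line[n] * math.cos(2 * math.pi * carrier_hz * t)
        level = math.hypot(i_sum, q_sum) / SAMPLES_PER_BIT  # about 0.5 if carrier present
        bits.append(1 if level > 0.25 else 0)
    return bits

bits_a = [1, 0, 1, 1, 0, 0, 1, 0]
bits_b = [0, 1, 1, 0, 1, 0, 0, 1]

# Both modulated carriers are simply added together on the shared wire.
line = [a + b for a, b in zip(modulate(bits_a, CARRIER_A),
                              modulate(bits_b, CARRIER_B))]

recovered_a = demodulate(line, CARRIER_A)
recovered_b = demodulate(line, CARRIER_B)
print(recovered_a == bits_a, recovered_b == bits_b)
```

Each receiver only responds to energy at its own carrier frequency, so the other stream drops out of the correlation even though both signals are present on the wire at the same time.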
It doesn't stop there - you can also have time division multiplexing. This type of system allocates a time slot to each data stream. Let's say there are ten data streams, each at 1kbps. If each stream's data is sent ten times as fast (10kbps), it only needs to occupy 10% of the usable "capacity" of the cable, so ten systems, each with its own time slot, can share one cable.
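The time-slot idea can be sketched in a few lines of Python. This assumes the simplest possible framing, one bit per slot with the slots in a fixed order, and it ignores how a real receiver finds the start of a frame. The function names `tdm_multiplex` and `tdm_demultiplex` are made up for illustration.

```python
def tdm_multiplex(streams):
    """Interleave streams: each frame holds one bit from every stream, in slot order."""
    return [bit for frame in zip(*streams) for bit in frame]

def tdm_demultiplex(line, num_streams, slot):
    """Keep only the bits that arrive in this receiver's own time slot."""
    return line[slot::num_streams]

NUM_STREAMS = 10
# Ten example streams of four bits each; stream s carries the low four bits of s.
streams = [[(s >> i) & 1 for i in range(4)] for s in range(NUM_STREAMS)]

# The shared line carries 10 bits in the time one stream produces 1 bit,
# so with 1kbps streams the line itself runs at 10 x 1kbps = 10kbps.
line = tdm_multiplex(streams)

recovered = [tdm_demultiplex(line, NUM_STREAMS, s) for s in range(NUM_STREAMS)]
print(recovered == streams)  # True
```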
There are a few other schemes as well, but these are beyond the scope of the question.