Since you are asking this question within the context of a PC and a modem, the answers I present are confined to the telephone domain.
You are correct in your explanation of sending the value "10" from your PC, up to the point where the modem converts the 1's and 0's that make up the binary value 00001010. In general, the modem converts the 1's and 0's into two different audio tones, because the telephone system was designed to transmit and receive audio waveforms as a varying electric current. These two audio tones (two distinct frequencies) pass through the local telephone system as a time-varying current. Once these signals are received at your local telephone company's central office ("CO"), i.e. the place where the telephone wire from your house connects, they are generally converted to digital data right there and sent over the national trunk lines digitally. At the receiving end of the phone call, the CO there converts these digital signals back to a time-varying current for presentation to the copper telephone lines that run to the subscriber with the receiving modem.
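As a rough illustration of the "converted to digital data" step at the CO, here is a small Python sketch that samples an audio tone and turns each sample into an 8-bit number, then turns the numbers back into a waveform. The 8000 samples per second rate, the 256 levels and the function names are assumptions picked for illustration, and the quantizer is a plain linear one. Real telephone networks use companded encodings (mu-law or A-law) rather than this simple scheme.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)
LEVELS = 256        # 8 bits per sample (assumed for illustration)


def digitize(samples):
    """Map each sample in the range -1.0..1.0 to an integer code 0..255."""
    codes = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip anything out of range
        codes.append(round((s + 1.0) / 2.0 * (LEVELS - 1)))
    return codes


def reconstruct(codes):
    """Map integer codes 0..255 back to sample values in -1.0..1.0."""
    return [c / (LEVELS - 1) * 2.0 - 1.0 for c in codes]


# A short burst of a 1270 Hz tone standing in for the modem's signal.
tone = [math.sin(2 * math.pi * 1270 * n / SAMPLE_RATE) for n in range(80)]
codes = digitize(tone)
rebuilt = reconstruct(codes)
```

The rebuilt waveform differs from the original only by a small rounding error per sample, which is why the receiving modem can still recognize the tones after the trip through the digital trunk.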
The receiving modem recognizes these two specific audio tones (one tone is a "zero", the other is a "one") and converts them back to a binary string of 1's and 0's. Then it's up to the PC connected to the receiving modem to convert these 0's and 1's back into 8-bit values.
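To make the two-tone idea concrete, here is a minimal Python sketch of the round trip for the value 10: bits to tones, then tones back to bits. The sample rate, the 300 bit/s rate and the 1070 Hz / 1270 Hz tone pair (the pair usually quoted for the originating side of a Bell 103 style modem) are assumptions chosen for illustration, and all the function names are made up. A real modem does considerably more (filtering, timing recovery, coping with noise).

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed)
BAUD = 300           # bits per second (assumed)
FREQ_ZERO = 1070.0   # tone in Hz used for a 0 bit (assumed)
FREQ_ONE = 1270.0    # tone in Hz used for a 1 bit (assumed)
SAMPLES_PER_BIT = SAMPLE_RATE // BAUD


def byte_to_bits(value):
    """Split an 8-bit value into a list of bits, most significant first."""
    return [(value >> i) & 1 for i in range(7, -1, -1)]


def bits_to_byte(bits):
    """Pack a list of 8 bits (most significant first) into an integer."""
    result = 0
    for b in bits:
        result = (result << 1) | b
    return result


def modulate(bits):
    """Emit a burst of one of the two tones for each bit."""
    samples = []
    for bit in bits:
        freq = FREQ_ONE if bit else FREQ_ZERO
        for n in range(SAMPLES_PER_BIT):
            samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def tone_energy(chunk, freq):
    """Measure how strongly a chunk of samples matches a given frequency."""
    i = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
            for n, s in enumerate(chunk))
    q = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n, s in enumerate(chunk))
    return i * i + q * q


def demodulate(samples):
    """Decide, one bit period at a time, which of the two tones is present."""
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        is_one = tone_energy(chunk, FREQ_ONE) > tone_energy(chunk, FREQ_ZERO)
        bits.append(1 if is_one else 0)
    return bits


sent_bits = byte_to_bits(10)          # the bits of 00001010
line_signal = modulate(sent_bits)     # what travels on the phone line
received_bits = demodulate(line_signal)
received_value = bits_to_byte(received_bits)
```

With a clean, noise-free signal like this one, received_value comes back as 10, the same value that was sent.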
So, to answer your question about what actually carries the data: it is really a multi-tiered mechanism. The modem converts the 0's and 1's to different time-varying signals (the two tones, represented by an analogous time-varying voltage) and then pushes these signals through the copper telephone wires to the CO as time-varying currents. The modem converts the time-varying signals to time-varying currents because the connection to the CO is what is known as a "current loop". The local copper-wire telephone loop to your CO carries electrically encoded audio signals as currents, not voltages. These electric signals travel very swiftly, so your "data" (which the time-varying current represents) travels very swiftly. Maybe not at the speed of light, but at some significant fraction thereof, depending on various conditions in the connecting lines.
You see? There are two mechanisms at play here: the binary data is represented as two audio-frequency tones, and the tones are transmitted in the form of electric currents. At least that's how it works between the modem and the telephone company's CO at both ends of the connection. In between the two participating CO's, a whole other set of mechanisms comes into play.
Also, to correct your thinking: binary data is indeed commonly encoded as two voltage levels in electronic systems, but not always. Some systems encode data as frequencies, like the modem. Others encode data as the phase of a constant-frequency signal. And there are a few other methods as well.
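For comparison with the two-tone scheme, here is a hedged Python sketch of the phase idea: the frequency stays the same for every bit, and only the phase of the signal flips between 0 and 180 degrees. The 1800 Hz carrier, the 40 samples per bit and the function names are assumptions made up for this illustration, not taken from any particular modem.

```python
import math

SAMPLE_RATE = 8000     # samples per second (assumed)
CARRIER = 1800.0       # constant carrier frequency in Hz (assumed)
SAMPLES_PER_BIT = 40   # length of one bit period in samples (assumed)


def phase_modulate(bits):
    """Send every bit at the same frequency; a 1 flips the phase by 180 degrees."""
    samples = []
    for bit in bits:
        phase = math.pi if bit else 0.0
        for n in range(SAMPLES_PER_BIT):
            samples.append(
                math.sin(2 * math.pi * CARRIER * n / SAMPLE_RATE + phase))
    return samples


def phase_demodulate(samples):
    """Compare each bit period with a reference carrier; a negative match means 1."""
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        match = sum(s * math.sin(2 * math.pi * CARRIER * n / SAMPLE_RATE)
                    for n, s in enumerate(chunk))
        bits.append(1 if match < 0 else 0)
    return bits


phase_bits = [0, 0, 0, 0, 1, 0, 1, 0]   # the value 10 again
phase_signal = phase_modulate(phase_bits)
decoded_bits = phase_demodulate(phase_signal)
```

Here the receiver only works because it knows the reference phase of the carrier, which is one practical difference from the two-tone approach, where it merely has to tell two frequencies apart.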
And leave all that electric wave & E-field propagation stuff to the physicists. It will only confuse you when you are dealing with practical electronic equipment. In this world of EE it's all about voltages and currents. You don't have to understand the phenomena beyond these two parameters to understand much of what goes on in most common electronic devices.