You are confusing the voltage levels (RS-232, we assume) with the serial protocol (let's just say UART, as it came from a time before everyone was obsessed with naming things).
Using your example, you cannot tell 0110 from 001100, for example. With the UART protocol you want to sample ideally in the middle of each bit cell. For the receiver to know where the middle is (the two ends are on different time sources and are never exactly the same), the first edge coming out of idle gives a reference from which to hit the middle of the next N bits. How big N can be depends on the accuracy of each side; for 8 or so data bits you can be pretty sloppy, and if you want you can re-sync on any edges you do find. (In your example, how would you receive 00000000? Or 11111111?)
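To make the timing idea concrete, here is a minimal sketch of that mid-bit sampling, assuming an oversampled capture of the line; the names (`OVERSAMPLE`, `rx_byte`) and the simulated waveform in `main` are just for illustration, not any particular chip's API.

```c
#include <stdio.h>
#include <stdint.h>

#define OVERSAMPLE 16   /* receiver sample ticks per bit cell */

/* Decode one 8N1 character from an oversampled capture of the line (1 = idle).
 * Timing idea only: find the falling edge out of idle, move half a bit cell to
 * land mid-start-bit, then step one full bit cell per data bit so every sample
 * lands near the middle of its cell. */
static int rx_byte(const int *samples, int n)
{
    int i = 0;
    while (i < n && samples[i] == 1) i++;          /* sit in idle until the start-bit edge */
    int t = i + OVERSAMPLE / 2;                    /* middle of the start bit              */
    uint8_t byte = 0;
    for (int bit = 0; bit < 8; bit++) {
        t += OVERSAMPLE;                           /* middle of data bit `bit`             */
        if (t >= n) return -1;
        byte |= (uint8_t)(samples[t] << bit);      /* LSB first                            */
    }
    return byte;
}

int main(void)
{
    /* Build an oversampled frame carrying 0x41: idle, start, 8 data bits LSB first, stop. */
    int wire[16 * 12], n = 0;
    const int frame[] = {1, 0, 1,0,0,0,0,0,1,0, 1};
    for (int i = 0; i < 11; i++)
        for (int k = 0; k < OVERSAMPLE; k++)
            wire[n++] = frame[i];
    printf("received 0x%02X\n", rx_byte(wire, n));  /* prints 0x41 */
    return 0;
}
```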
The start bit gives us an edge to distinguish from idle, to tell us when the character starts, and a reference from which to sample the bits. The stop bit ensures we go back to idle for at least one bit cell or two. When the line is saturated with data, no gaps, no idle other than the stop bit, then you have another problem which the UART protocol doesn't necessarily solve (well, parity helps): if you come in in the middle (someone plugs the thing in while data is moving, or any other reason), the start and stop bits help to frame the data. Without parity you may still be able to figure out where you are; with parity you have an even better chance, but it is not perfect.
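For reference, this is what one framed character looks like on the wire under the usual convention (start bit low, 8 data bits LSB first, optional even parity, stop bit high). The little program below just prints the bit sequence; it is a sketch of the framing, not a driver.

```c
#include <stdio.h>
#include <stdint.h>

/* Print the on-wire bit sequence for one character: start bit (0), 8 data
 * bits LSB first, optional even parity bit, stop bit (1).  Idle is 1. */
static void print_frame(uint8_t byte, int use_parity)
{
    int ones = 0;
    printf("0");                                  /* start bit               */
    for (int i = 0; i < 8; i++) {
        int b = (byte >> i) & 1;                  /* LSB first               */
        ones += b;
        printf("%d", b);
    }
    if (use_parity)
        printf("%d", ones & 1);                   /* even parity             */
    printf("1\n");                                /* stop bit, back to idle  */
}

int main(void)
{
    print_frame(0x55, 0);   /* prints 0101010101  (start, data, stop)         */
    print_frame(0x55, 1);   /* prints 01010101001 (start, data, parity, stop) */
    return 0;
}
```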
Now there are other protocols. Many other protocols. Go look up IRIG-106: instead of a start bit you have a sync pattern, which can be followed by hundreds of bits before the next pattern, with no dead periods. The IRIG document has a nice chart of various encodings, where NRZ-L (non-return-to-zero, level) is what we are used to with a simple UART. An interesting one is bi-phase-L, where there is a state change mid bit cell, so your 0110 would be transmitted at 2x the frequency of the data and would be 01101001. Worst case you can never have more than two half bit cells in a row at the same level, so there are many edges with which to bit-sync.
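A quick sketch of that encoding, using the convention that matches the example above (0 becomes 01, 1 becomes 10; the polarity convention varies from document to document, the guaranteed mid-bit transition is the point):

```c
#include <stdio.h>

/* Bi-phase-L (Manchester) encode a string of '0'/'1' characters into half
 * bit cells.  Convention assumed here: 0 -> 01, 1 -> 10, which turns the
 * 0110 from above into 01101001. */
static void biphase_l(const char *bits)
{
    for (const char *p = bits; *p; p++)
        printf("%s", (*p == '1') ? "10" : "01");
    printf("\n");
}

int main(void)
{
    biphase_l("0110");      /* prints 01101001 */
    biphase_l("00000000");  /* prints 0101010101010101 -- still full of edges */
    return 0;
}
```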
Another interesting one is MIL-STD-1553, where they use bi-phase-L (a popular encoding that goes by many names: biphase, Manchester, etc.), but it is not continuous data, it is a burst of one to many words. They use an intentional bi-phase-L violation, three half bit cells at one level followed by three at the other, as the sync pattern, then go into the message encoded in bi-phase-L.
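The reason that works as a sync is that legal bi-phase-L data can never produce more than two identical half bit cells in a row, so a run of three is unmistakable. A toy search over a half-cell stream (polarity and exact widths here are illustrative only; the standard defines the real command/data sync shapes):

```c
#include <stdio.h>
#include <string.h>

/* Find a 1553-style sync in a stream of half bit cells: three half cells at
 * one level followed by three at the other.  Legal bi-phase-L data never has
 * more than two identical half cells in a row, so this run can only be sync. */
static int find_sync(const char *halves)
{
    const char *hit = strstr(halves, "111000");
    if (!hit)
        hit = strstr(halves, "000111");
    return hit ? (int)(hit - halves) : -1;
}

int main(void)
{
    /* some encoded data, then a sync, then more encoded data */
    const char *stream = "011010" "111000" "100101";
    printf("sync found at half-cell %d\n", find_sync(stream));   /* prints 6 */
    return 0;
}
```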
There is no reason why you couldn't use RS-232, RS-422, etc. voltage levels with a protocol other than UART. But you still need edges every so often in the data in order to synchronize the clocks (if you carry the clock along on its own wire, that is another story), and you need some way to determine where the groups of bits that make up bytes or words are, so you have to have a sync pattern or a start bit or something else, or do something like SPI or I2C and use a separate select line or a start condition to mark the beginning. Classic Ethernet used a long square wave with a few bits at the end to indicate the end of the preamble and the start of the packet; MDIO has something similar.
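A simplified model of that Ethernet/MDIO-style framing, purely to show the idea (not a faithful PHY): the receiver watches the alternating 1010... preamble go by and treats the first pair of identical bits as the end-of-preamble marker.

```c
#include <stdio.h>
#include <string.h>

/* Simplified model: the preamble is an alternating 1010... square wave at the
 * bit rate, so the first "11" in the stream marks the end of the preamble and
 * the start of the frame.  Idea only, not a real PHY implementation. */
static int find_frame_start(const char *bits)
{
    const char *hit = strstr(bits, "11");
    return hit ? (int)(hit - bits) + 2 : -1;      /* payload begins after the 11 */
}

int main(void)
{
    const char *wire = "10101010101010101011" "01101001";  /* preamble+marker, then payload */
    printf("payload starts at bit %d\n", find_frame_start(wire));  /* prints 20 */
    return 0;
}
```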
At the end of the day you cannot have a reliable single-signal serial protocol without some way to know where the word/message boundaries are in the bitstream, and likewise you cannot do it without knowing where/when to sample each bit. Even with a continuous bitstream, where perhaps you think you know when time zero was and you can just count to 8 and mark off another byte, you might get lucky, but you still have to sync with the sending clock, because your clock is based on a different reference and will drift relative to the sender's clock. So you can try to pull this off as long as you periodically re-sync on the edges you find and ensure that there is an edge at least every N bits, where N follows from the math on the accuracy of the two clocks.
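If you want to put rough numbers on N, the usual back-of-the-envelope goes like this (a sketch, ignoring jitter, edge-detection latency, and oversampling granularity): the accumulated timing error since the last sync edge has to stay under half a bit cell at the point where you sample the last bit before the next edge.

```c
#include <stdio.h>

/* Worst-case "how many bits can I go between edges?": if the combined
 * sender+receiver clock error is `mismatch` (as a fraction) and we sample
 * mid-bit starting from a sync edge, the accumulated error at bit N is about
 * (N + 0.5) * mismatch bit cells, and it must stay under half a bit cell or
 * we sample the wrong bit. */
static int max_bits_between_edges(double mismatch)
{
    return (int)(0.5 / mismatch - 0.5);
}

int main(void)
{
    printf("+/-1%% each side  -> %d bits\n", max_bits_between_edges(0.02));    /* 24   */
    printf("+/-2%% each side  -> %d bits\n", max_bits_between_edges(0.04));    /* 12   */
    printf("+/-100ppm each   -> %d bits\n", max_bits_between_edges(0.0002));   /* 2499 */
    return 0;
}
```

Which is why the 8N1 frame, whose stop bit is sampled about 9.5 bit cells after the start edge, tolerates a couple of percent of error per side, and why longer runs between edges need much better matched clocks.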