35

A pretty straightforward, fundamental, albeit naive, question:

Would having 4 states per "bit" rather than 2 mean twice the storage space? In case that isn't clear, I mean: what if every "storage structure", rather than representing only 2 values (base 2: 0, 1), could represent 4 values (base 4: 0, 1, 2, 3)?

J.Todd
  • 3,833
  • 5
  • 22
  • 27
  • 4
    I don't know that it means 2x the storage space as there are some costs with storing the multiple levels. – Erik Eidt Oct 10 '17 at 02:49
  • 13
    See [Multi-Level Cell](https://en.wikipedia.org/wiki/Multi-level_cell), as used in much flash memory. They're now moving to Triple-Level Cell, or three bits of data per physical cell. – SomeoneSomewhereSupportsMonica Oct 10 '17 at 05:54
  • 1
    If 1 bit could represent 4 states rather than two, 1 bit = 4 states, 2 bits = 16, 8 bits = 65536 states compared to 256. The growth is exponential. – Neil Oct 10 '17 at 06:34
  • 52
    A "bit" is *defined* as having two states, so a storage cell with four states would store two bits per definition. – JacquesB Oct 10 '17 at 06:42
  • 24
    @JacquesB while technically correct (the best kind) that's clearly not what the asker meant. – MetaFight Oct 10 '17 at 08:34
  • 11
    If one "bit" with 4 states is as fat as two "bits" with two states each, the storage space is identical. – mouviciel Oct 10 '17 at 09:03
  • 3
    This has been done, to increase the effective bit density of memories. I believe the technique was used (in some cases up to 8 states, I believe) in some varieties of commercial flash memories and related technologies. Don't know if it's still being done. (The space savings is mainly in the on-chip wiring, I believe.) – Daniel R Hicks Oct 10 '17 at 12:34
  • @mouviciel So in saying you need two bits to equal one quad, are you agreeing with the premise that storage is doubled? – JimmyJames Oct 10 '17 at 16:15
  • 1
    Make your bit large enough and you could store the entire universe in it. Oops, already happened. –  Oct 10 '17 at 18:12
  • 1
    Base2 is "bit", base3 is ["trit"](https://en.wikipedia.org/wiki/Ternary_numeral_system), so this question is about "quits" I guess, though [Wikipedia doesn't call them that](https://en.wikipedia.org/wiki/Quaternary_numeral_system). Though it does say byte->tryte->crumb, which is weird. – Mooing Duck Oct 10 '17 at 23:31
  • 1
    @JimmyJames - It depends on what is compared: a memory device with _n_ bits has the same storage capacity as a memory device with _n/2_ quads. In my understanding, _storage space_ includes the physical volume of memory: depending on the technology, these two memory devices may or may not be the same size. – mouviciel Oct 11 '17 at 07:59
  • 1
    Bits aren't real things. – user253751 Oct 11 '17 at 20:28
  • 1
    @immibis and variables do not exist in nature. Computers are the most completely unnatural thing that I know of (besides income tax forms). –  Oct 12 '17 at 14:07
  • @nocomprende I didn't mention variables. – user253751 Oct 12 '17 at 19:59
  • @immibis right I was just pointing out that most of what we deal with in life is not real, from bits all the way to the internet. It is all made-up concepts. People seem to forget this. –  Oct 12 '17 at 22:28

8 Answers

105

The word you are looking for is not "bit" but "symbol." A "symbol" is the unit used when mapping hardware signals (such as voltages or magnetic patterns) onto logical bits. If a symbol can take 4 states, it can encode 2 bits' worth of information.

Of course, that argument says nothing about the resource usage of the symbol. If you're sending symbols along a wire as voltages, the different symbols look more and more similar as you increase the number of states per symbol. If I have a 0-5V wire and 2 states per symbol (1 bit), my two states are 0V and 5V, with 5V between them. If I have the same wire but encode 4 states per symbol (2 bits), my states are 0V, 1.67V, 3.33V and 5V, with only 1.67V between adjacent levels. It's now easier for noise to corrupt my signal.
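
A quick back-of-the-envelope sketch of that tradeoff in Python (the evenly spaced levels on a 0-5V wire match the example above; illustrative only):

```python
import math

def level_spacing(v_max, states):
    # Evenly spaced levels on a 0..v_max wire: more states per symbol
    # means smaller gaps between levels, i.e. less noise margin.
    return v_max / (states - 1)

for states in (2, 4, 8):
    print(f"{states} states = {math.log2(states):.0f} bits/symbol, "
          f"{level_spacing(5.0, states):.2f} V between levels")
# 2 states = 1 bits/symbol, 5.00 V between levels
# 4 states = 2 bits/symbol, 1.67 V between levels
# 8 states = 3 bits/symbol, 0.71 V between levels
```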

There is a law relating these quantities, the Shannon–Hartley theorem, which relates a channel's capacity (in bits per second) to its bandwidth and the noise on the line. It turns out that there's a limit to how many bits you can cram across a wire: using more states per symbol leads to more errors, requiring more error correction.
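
A sketch of the formula itself; the bandwidth and SNR numbers below are made up for illustration:

```python
import math

def capacity_bits_per_sec(bandwidth_hz, snr_linear):
    # Shannon-Hartley: the maximum error-free bit rate of a noisy channel.
    return bandwidth_hz * math.log2(1 + snr_linear)

# A hypothetical 1 MHz channel at 30 dB SNR (a linear SNR of 1000):
print(capacity_bits_per_sec(1e6, 1000) / 1e6)  # ~9.97 Mbit/s
```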

We do use this technique in real life. Digital television uses QAM-64, with 64 states (and thus 6 bits per symbol). Ethernet uses 4 voltage levels, so 2 bits per symbol.

Edit: I used bit transmission rates rather than storage because it's more common to see symbols with many states in transmission, which makes the story clearer. If one wishes to look at storage and storage alone, one could look at Multi-Level Cells in flash memory, as Someone Somewhere mentioned in the comments. Such memory uses the exact same approach, storing 3 bits as 8 different charge levels in a cell (or more!).

Cort Ammon
  • 10,840
  • 3
  • 23
  • 32
  • 1
    Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackexchange.com/rooms/66982/discussion-on-answer-by-cort-ammon-would-having-4-states-per-bit-rather-than-2). – maple_shaft Oct 12 '17 at 12:33
  • this answer is completely wrong on account of Ethernet. See [here](http://www.iebmedia.com/images/art_images/IEB61_p32_1.jpg) - usual 100Base-T has MLT3 with 3 levels, and 1000Base-T has PAM5 with 5 levels, 10GBase-T has PAM16 with 16 levels. There ain't any version of Ethernet that has 4 levels that I know of or that I could find anywhere. @CortAmmon where on Wikipedia did you find that Ethernet has 4 voltage levels? I'd be more than happy to dig into it and verify where that comes from. –  Oct 13 '17 at 16:57
21

One quaternary memory cell can store exactly as much information as 2 binary memory cells:

Quaternary Binary
0          00
1          01
2          10
3          11

So if you have the same number of memory cells, but they are quaternary, then you have twice as much memory. But if a quad cell takes twice as much space on a chip, then there is no benefit.

Put another way, if you had 1 gigaquad of quaternary storage, it could store as much information as 2 gigabits of normal binary memory, because each quad can be expressed with two bits.
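
Just to make the bookkeeping concrete, a tiny Python sketch of the table above (illustrative only):

```python
# Each quaternary digit maps to exactly two bits, so n quads = 2n bits.
QUAD_TO_BITS = {"0": "00", "1": "01", "2": "10", "3": "11"}

def quads_to_bits(quads):
    return "".join(QUAD_TO_BITS[q] for q in quads)

print(quads_to_bits("3102"))  # 11010010
```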


In a way this whole line of thought is just of academic interest. You can already think of memory chips as storing, say, 2^32-state cells, because you cannot fetch 1 bit from them; you always get a full word. And if in the future someone came up with a way to store that word in 4-state physical cells more efficiently than in 2-state cells, it would be used, but it wouldn't be visible outside the memory chip, which would still handle full memory words that can take, for example, 2^32 different states.

hyde
  • 3,744
  • 4
  • 25
  • 35
  • 1
    "One quarternary memory cell can store exactly as much information as 2 binary memory cells" true but 2 base-4 numbers can hold four times as much as two base-2 values. – JimmyJames Oct 10 '17 at 15:43
  • 1
    @JimmyJames Four times as many possible states is not the same thing as four times as much storage. See the conversation under Richard Dunn's answer. – Sean Burton Oct 10 '17 at 16:44
  • 2
    For me, the obvious follow-up question to this answer is, "Well... *do* quad cells take twice as much space on chip?". – Daniel Wagner Oct 10 '17 at 16:51
  • @SeanBurton the 'amount of information' and 'storage' are not the same thing. – JimmyJames Oct 10 '17 at 16:51
  • 5
    Then I'll rephrase: four times as many possible states is not the same thing as four times as much information. – Sean Burton Oct 10 '17 at 16:52
  • 1
    @JimmyJames That is a contradiction. 2 quad cells equal 4 binary cells. So you are saying that 4 binary cells store 4 times as much information as 2 binary cells, even though it is just 2x as many cells. IOW, you are wrong, at least if you calculate information as "bits" (or bytes, or terabytes). – hyde Oct 10 '17 at 18:48
  • @DanielWagner That depends on the technology used. But for most intents and purposes, you can consider modern computer memory to consist of words. On chip surface, words are made from individual bits, but also some per-word transistors. So longer word means less of these. But it is not 4 times as many states, it might for example be 2^32 as many states for 32 bit word size. – hyde Oct 10 '17 at 18:53
  • Mea culpa. It was a bad morning. – JimmyJames Oct 10 '17 at 19:48
  • This is the best answer. – Tony Ennis Oct 12 '17 at 03:44
9

In basic theory, yes. In reality, no - because we don't actually store data in bits anyway (on HDDs). Cort Ammon covers the issues in data transmission very well. RAM, cache, and SSDs store data as bits, but HDDs are different, due to the nature of their physical material and our efforts to pack more data onto them. Most data is still stored on HDDs, so I'll focus on those. I'll go well beyond the explanation you'll find in most sources, but will try to cite sources where I can. These sources must be dug up from the ancient depths of the internet, because this is - to a large extent - truly forgotten knowledge.

First, hard drives store information as magnetic fields on the surface of the drive platters. The drive head reads these by sensing the flux from the change in that field - far easier to measure than the actual direction and strength of the magnetic field itself. But if the field stays the same for 50 segments in a row, the head can't actually count that there were 50: it reads a flux spike at the first segment, then no flux for a while after that, and it cannot track time accurately enough to be certain the field was unchanged for exactly 50 segments.

So, the basic (oversimplified) model is to store a bit as a pair of magnetic fields. The first is always a flip from the prior segment, and the second is either a flip (representing 1) or no flip (representing 0). So a 0 is FN (flip-null) and a 1 is FF (flip-flip). The drive's timing is accurate enough to distinguish one flux spike from two within a segment. This format is called Frequency Modulation (FM). It gives clear signals, BUT every bit requires two spaces on the drive - very inefficient. So no hard drive actually shipped with this most basic encoding; simple compression tricks were used instead. The simplest is Modified Frequency Modulation (MFM), which changes the pattern so that the extra magnetic flip is used only when a 0 is preceded by another 0. This lets engineers cram nearly twice as much data into the same space; it was used on the first HDDs and is the format on floppy disks. After that, a more advanced system called Run Length Limited (RLL) was developed with a similar general idea, which I won't go into because it gets much more complicated and there are multiple implementations.
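
As a toy sketch of the two encodings just described (F = flux flip, N = no flip; the encoder's assumed initial "previous bit" of 1 is arbitrary):

```python
def fm_encode(bits):
    # Frequency Modulation: every bit cell gets a clock flip (F),
    # then a second flip for a 1 or nothing (N) for a 0.
    return "".join("FF" if b else "FN" for b in bits)

def mfm_encode(bits, prev=1):
    # Modified FM: drop the clock flip except between two consecutive 0s,
    # roughly halving the flux transitions needed per bit.
    out = []
    for b in bits:
        if b:
            out.append("NF")                      # flip in the data half only
        else:
            out.append("FN" if prev == 0 else "NN")
        prev = b
    return "".join(out)

bits = [1, 0, 0, 1, 0]
print(fm_encode(bits))   # FFFNFNFFFN -> 7 flips
print(mfm_encode(bits))  # NFNNFNNFNN -> 3 flips for the same data
```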

But we don't use any system like that today. Instead, we use a system called Partial Response, Maximum Likelihood (PRML). PRML has the head read a length of the track and collect a sample of the magnetic signal, then compares it against a reference set of stored samples to determine which one it matches best. It forgoes the entire concept of flux spikes and instead uses pattern matching (I oversimplify, but the oversimplification is worth it), where each pattern corresponds to a set of bits. It uses noise filters and other techniques to remove potential errors. It's best to think of the signal as a complex waveform that the HDD knows how to translate into a set of bits. In this sense, the data is stored more in an analog format than a digital one, because the physical material supports the gradual variations of analog better than the sudden jumps of digital.
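
To make the "compare against stored samples" idea concrete, here is a cartoon of it in Python. The reference waveforms and their bit patterns are entirely made up, and real PRML uses Viterbi sequence detection rather than this naive per-chunk nearest match:

```python
# Reference read-signal shapes (fabricated for illustration) and the
# bit pairs they stand for.
REFERENCES = {
    (0, 0): [0.0, 0.1, 0.0, -0.1],
    (0, 1): [0.0, 0.5, 1.0, 0.5],
    (1, 0): [1.0, 0.5, 0.0, -0.5],
    (1, 1): [1.0, 0.9, 1.0, 0.9],
}

def decode(samples):
    # Pick the reference the noisy samples most resemble (least squared error).
    def err(ref):
        return sum((s - r) ** 2 for s, r in zip(samples, ref))
    return min(REFERENCES, key=lambda bits: err(REFERENCES[bits]))

noisy_read = [0.9, 0.4, 0.1, -0.4]  # a noisy version of the (1, 0) shape
print(decode(noisy_read))           # (1, 0)
```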

The best guide to this is at http://www.pcguide.com/ref/hdd/geom/data.htm (hit the Next button a few times to read all of it) and there are a few other sources - mostly from people who created massive repositories of computer knowledge that no one has any reason to know. A decent additional source (which is good but not quite 100% perfect as far as I can tell) is at http://www.tomshardware.com/reviews/hard-drive-magnetic-storage-hdd,3005-6.html

TL;DR: Hard disk drives don't store data in a format anything like 1s and 0s; instead they use complex signal processing to cram signals into the smallest space possible, and decode them when reading. So they are really base-agnostic.

I wouldn't be surprised if base-4 storage has been attempted on SSDs or RAM at some point. It all depends on the physics and chemistry of the materials. The engineers and scientists will push those materials as far as they can, and will pursue whatever route yields the best results.

user3685427
  • 247
  • 1
  • 3
  • Care to discuss a storage concept? If we were to store symbols based on a coordinate plane rather than sequentially, it seems to me like we could store extra bits based on the coordinate position, and position relative to other bits. https://chat.stackexchange.com/rooms/66911/vizs-discussion-2 – J.Todd Oct 10 '17 at 21:08
  • Manchester Coding was developed for magnetic tape, and Phase Shift Keying for radio. Similar ideas to what you are saying. –  Oct 11 '17 at 12:06
  • Didn't know about that, but not really surprised either. – Walfrat Oct 11 '17 at 13:07
  • base-4 storage on SSDs is called MLC. – user253751 Oct 13 '17 at 00:18
6

Yes, having more states will allow each "cell" of storage, or each symbol on a data transmission line, to carry more information.

But there is no free lunch: we need to actually be able to distinguish those states. It turns out it's easy to build binary logic gates and much harder to build gates that distinguish, process, and regenerate more than two logic levels.

And then there is the issue of attenuated signals. In a two-level system you can simply design your threshold to work with worst-case attenuation. In a four-state system where significant attenuation is expected, you need to adapt your thresholds to the particular attenuation of your system, not just the worst case. In practice that means you need to add an attenuation measurement system to your communications system.
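
To make that concrete, here is a minimal nearest-level decoder, reusing the 0-5V levels from Cort Ammon's answer; the 30% attenuation figure is arbitrary:

```python
def decode_level(voltage, levels):
    # Nearest-level decision: return the index of the closest nominal level.
    return min(range(len(levels)), key=lambda i: abs(voltage - levels[i]))

binary_levels = [0.0, 5.0]
quaternary_levels = [0.0, 5/3, 10/3, 5.0]

loss = 0.7  # signal attenuated to 70% of its nominal voltage
print(decode_level(5.0 * loss, binary_levels))        # 1 -> still correct
print(decode_level((10/3) * loss, quaternary_levels)) # 1 -> should have been 2
```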

All that said, there are situations where the extra complexity DOES make sense. A lot of SSDs now use more than two levels per flash cell (known as MLC or TLC), and modern high-speed communication protocols nearly always use multi-level encodings.

Peter Green
  • 2,125
  • 9
  • 15
  • Ternary is not too hard. Computers have been built using that. –  Oct 10 '17 at 18:17
  • 1
    Yep, ternary is easier than quaternary because you only have to distinguish "positive", "negative" and "off" rather than having to distinguish multiple levels of the same sign. Still harder than binary though. – Peter Green Oct 10 '17 at 18:22
  • 2
    The interesting thing about Morse code by radio is that the signal is either on or not. The not on condition is not information. So it is not the alternation of on and off that carries info, it is the length and spacing of the on pulses. No other modern representation system works this way that I know of. –  Oct 10 '17 at 18:27
  • 1
    Bar codes? Bar and space separate the digits and width determines the value. – Sopuli Oct 10 '17 at 22:50
  • @Sopuli ok, so in that case, the dark part of the bar code does not reflect light, so it would be the "off" or "no signal" state. I guess my point was that encoding is not always simply two signal states, but could be signal vs no signal, which seems odd, except in real cases, like Morse code, bar codes, speech, etc. Computer representations do not usually waste space on storing regions of "no signal", they are more efficient than physical signal systems. In the physical case we are not at liberty to "fast forward" over the gaps in content, we must wait them out. –  Oct 11 '17 at 12:01
  • @nocomprende "Signal" and "no signal" are two signal states – user253751 Oct 13 '17 at 00:17
  • @immibis not... Really. The Morse code example showed that gaps in time can be of arbitrary length, so you canna' tell if the transmission is still going on. Signal Fading is common, and an equipment failure or confusion by the operator could cause the message to stop. In Single Sideband voice, no carrier is sent when there is no modulation, so that is why people say "over" when they are done. With FM it is full carrier until the sender lets go of the transmit button, so you can tell when they are done sending. See the difference? Black bar code bars and having nothing to read are the same. –  Oct 13 '17 at 12:09
  • @nocomprende Yes really, "no signal" is a signal state. – user253751 Oct 16 '17 at 04:00
  • @immibis sure, if you stare at TV snow long enough, you can learn everything there is to know. I used to listen to my homemade UHF transverter to find other Hams to talk with. I could hear their voices, way deep in the noise... Your computer is busy computing away on cosmic background radiation, whenever you power it off. –  Oct 16 '17 at 15:37
  • @nocomprende Your username is relevant. – user253751 Oct 16 '17 at 21:32
  • @immibis if no-signal is a signal state, then there is an infinity of signals of all kinds around us continually. Hey, you just invented Quantum Mechanics! Let's see what you can learn from all these no-signals. When do you publish? Of course, everyone else already knows all of it also. What is the rest of this infinitely long comment saying? Perhaps all the secrets of the universe are encoded in it. But then, all comments are the same way. –  Oct 17 '17 at 14:18
  • @nocomprende Now you are confusing "signal states" with "information". – user253751 Oct 17 '17 at 19:37
2

If a "bit" (more precisely, a symbol) had 4 states instead of two, then yes, you would have twice the amount of memory. This might or might not take twice as much space, depending on the technology used.

There is a real-life example that you have in front of your eyes every day: Ethernet. It's not memory, but it's similar insofar as it transmits data. Among others, you have the ordinary "fast ethernet" at 100 MBit (100BASE-TX), and you have gigabit ethernet (1GbE).

Clearly, 1GbE requires 10 times higher frequencies than 100 MBit (just as 100 MBit requires a 10 times higher frequency than 10 MBit); that's why you need more expensive cables, too. Obviously.

Oops... that isn't true at all.

100 MBit ethernet transmits over two cable pairs at 100 MHz whereas GbE transmits at 125 MHz over 4 cable pairs.

Wait, so GbE is really only 2 1/2 times faster than 100 Mbit ethernet? I only get 250 MBit/s out?

No: it also uses 5-PAM coding, which can encode 2.32 bits per pulse per cable pair, of which 2 bits are used as actual information; the remainder makes the signal more resilient to noise. Thanks to those fractional bits, 1000BASE-T is able to drop 8B10B coding, too.

So you've doubled the number of wire pairs and slightly increased the frequency, but you get 10 times the throughput!

Now if you think this is sheer magic, look at how digital cable television works, and if you are still not convinced, look into ADSL, which uses 32768-QAM to encode 15 bits in one symbol. Same old copper wire, same frequency band, 15 times more stuff going through.
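
All of those figures fall out of the same relation, bits per symbol = log2(states); a quick Python check:

```python
import math

for name, states in [("5-PAM", 5), ("QAM-64", 64), ("32768-QAM", 32768)]:
    print(f"{name}: {math.log2(states):.2f} bits/symbol")
# 5-PAM: 2.32 bits/symbol
# QAM-64: 6.00 bits/symbol
# 32768-QAM: 15.00 bits/symbol
```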

EDIT:
Another very obvious real-life example, which I completely forgot about (it is just too obvious, apparently!) and which you also have in front of your eyes every day: USB pendrives.
Those commonly use MLC flash memory. What's that? It is a type of memory cell that stores one of four different charge levels. That's the smallest unit you can access at the hardware level. So you could say your "bits" indeed have 4 states (they don't; you really just get two bits out instead of one, and you can only read complete sectors from the device anyway... but you could arguably look at it that way).
Same number of cells, but double the memory. Cheaper, smaller, somewhat less reliable, but... first and foremost, cheaper.

Damon
  • 253
  • 1
  • 3
2

You may be interested to know that the Russians developed a computer, the Setun, that was ternary instead of binary. That means each symbol could take the values -1, 0, or 1, so each storage element could hold "three" values instead of "two".

Potential future applications

With the advent of mass-produced binary components for computers, ternary computers have diminished in significance. However, Donald Knuth argues that they will be brought back into development in the future to take advantage of ternary logic's elegance and efficiency.

As you might start to suspect, there may be a more efficient base for a numbering system. (Although the ability to exploit this efficiency depends on what we can physically manufacture.) It turns out that the constant e, the base of the natural logarithm (~2.71828), has the best radix economy, followed by 3, then 2 and 4 (which tie).

Radix economy measures how much numeric range you can represent versus how many symbols it takes to do it.

For instance, the mathematical number three is represented as 3 in base 10, but as 11 in base 2 (binary). Base 10 can express larger numbers with fewer digits than binary can, but the symbol table of base 10 is 5x larger (0...9) than the symbol table of base 2 (0, 1). Comparing the expressive power to the size of the symbol set is called "radix economy" (the radix being the number of the base, for example 2 in binary, or "base 2"). The natural questions that follow are: where do I want to be in this tradeoff? What number should I adopt as the radix? Can I optimize the tradeoff between expressive power and the size of the symbol set?

If you look at the chart in the radix economy article on Wikipedia, you can compare the economies of various bases. In our example, base 2 has a radix economy of 1.0615, while base 10 has an economy of 1.5977. The lower the number, the better: base 2 is more efficient than base 10.

Your question's base 4 has an economy of 1.0615, the same as base 2 (binary), so adopting it over base 2 gets you exactly the same storage efficiency per number, on average.

If you're wondering whether there is an ideal number to adopt as a base, the chart shows that there is: it's not a whole number, but the mathematical constant e (~2.71828), with an economy of 1.0. This means it is as efficient as possible: for any set of numbers, on average, base e will give you the smallest representation relative to its symbol table. It's the best "bang for your buck".
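
The asymptotic figures in that chart can be reproduced directly: the cost of base b relative to base e works out to b/(e·ln b). A short Python check:

```python
import math

def relative_radix_economy(b):
    # Cost of base b relative to base e: (b / ln b) / (e / ln e) = b / (e * ln b)
    return b / (math.e * math.log(b))

for b in (2, 3, 4, 10):
    print(f"base {b}: {relative_radix_economy(b):.4f}")
# base 2: 1.0615
# base 3: 1.0046
# base 4: 1.0615
# base 10: 1.5977
```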

So, while you might think your question is simple and basic, it is actually subtly complex, and a very worthwhile issue to consider when designing computers. If you could design an ideal discrete computer: using base 4 offers the same deal - the same space for the cost - as binary (base 2); using base 3 (ternary) offers a better deal than binary (and the Russians did build a physical, working base-3 computer); but ideally, you would use base e. I don't know of anyone who has built a working physical computer in base e, but mathematically it would offer a better deal of space than binary and ternary - in fact, the best deal out of all the real numbers.

user1936
  • 662
  • 5
  • 17
  • this doesn't seem to even attempt to address the question asked, would having 4 states per “bit” rather than 2 mean twice the storage space? See [answer] – gnat Oct 11 '17 at 16:42
  • @gnat I think the concept of radix economy directly addresses how much data you get per symbol. Not only does it answer the case of 4, it answers the case of any number. It's the general solution. – user1936 Oct 11 '17 at 17:27
  • 1
    I twice checked the Wikipedia link hidden under "turns out" and frankly I still fail to see how it relates to storage space – gnat Oct 11 '17 at 17:32
  • 2
    @gnat I've updated the answer. Hopefully at this point you see how it at least attempts to answer the question. – user1936 Oct 11 '17 at 19:28
2

Would you believe I can encode the sum total of human knowledge with a single match?

If I encode a bit in a single match the symbols might look like this:

[image: a match in one orientation] [image: the same match in the opposite orientation]

With enough matches I can say anything. But I can say twice as much with the same match if I add two more symbols. Which might look like this:

[image: the match rotated to a third orientation] [image: the match rotated to a fourth orientation]

Twice as much info with the same match! Well, why not? Why stop there? Rotate every symbol 45 degrees and we double again. Then 30, then 15, on and on. Soon I have enough symbols that I can say anything and everything with only one match! Once I do that, though, we have a problem. What does this match say?

[image: the match at an angle that falls between symbols]

How can you be sure exactly which symbol that is now? How much time do you need to be sure? That's the rub. The more symbols I add, the more effort it takes you to tell them apart.
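
A tiny sketch of that readout problem in Python (the angles and the 6-degree wobble are made-up numbers):

```python
def read_match(measured_deg, n_symbols):
    # Snap a measured angle to the nearest of n evenly spaced symbol angles.
    spacing = 360 / n_symbols
    return round(measured_deg / spacing) % n_symbols

# A match meant to show the symbol at 90 degrees, read with a 6-degree wobble:
print(read_match(96, 4))   # 1 -> correct (symbols are 90 degrees apart)
print(read_match(96, 32))  # 9 -> wrong; symbol 8 was intended (11.25 apart)
```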

Would having 4 states per “bit” rather than 2 mean twice the storage space?

If we're talking per match, then sure. But even if that didn't slow down our match-reading speed, we're now taking up more of my kitchen counter space. It's always something.

candied_orange
  • 102,279
  • 24
  • 197
  • 315
  • This is essentially how radio modulation schemes like Quadrature AM and Phase Modulation work. If you want some real fun, study how rotating phase vectors represent two simultaneous tones in Single Sideband or FM. –  Oct 12 '17 at 14:01
-5

Having 4 symbols per digit instead of two means that you can store twice as much information in a single digit. However, as you increase the number of digits, you can store exponentially more information:

Any n digits in base 2 can encode 2^n states, whereas base 4 can encode 4^n.
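
A quick numeric check of those counts (note, as the comments below discuss, that 4^n states is log2(4^n) = 2n bits of information, i.e. twice the bits per digit rather than exponentially more):

```python
import math

n = 8
print(2**n, 4**n)       # 256 vs 65536 possible states
print(math.log2(2**n))  # 8.0  bits of information
print(math.log2(4**n))  # 16.0 bits -> twice as many, not 256x
```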

donjuedo
  • 123
  • 3
marstato
  • 4,538
  • 2
  • 15
  • 30
  • You don't understand at all what is meant with the term symbol in this context. – Pieter B Oct 10 '17 at 09:01
  • @PieterB Could you enlighten me? – marstato Oct 10 '17 at 09:18
  • 6
    your statement is: "4 symbols/bit." That shows a lack of understanding. It's 4 states/symbol and with that 4 states/symbol it would encode 2 bits. – Pieter B Oct 10 '17 at 09:24
  • @PieterB I'm quite sure he's just using terms from CS instead of electrical/communications engineering. In CS, "symbol" usually refers to symbols in formal languages. Like in base 16, a single digit is usually expressed with the symbols 0 to F. I'd say this approach of explaining it is a bit too high-level/abstract to really answer the question, since it doesn't consider how the "digits" are stored, but it's not wrong. – kapex Oct 10 '17 at 10:46
  • 4
    @Kapep it is wrong. He is using "amount of information" and "can encode N states" interchangeably which is absolutely not correct. Information is measured in bits, not number of states. Doubling bits per symbol adds as much information as doubling the number of symbols. – user5226582 Oct 10 '17 at 12:22
  • 3
    You should edit this to clarify that when you say "4 symbols per digit", you mean that each digit-place has 4 possible values (or states or "symbols"). When I see the phrase "4 symbols per digit", the first thing I think is that writing down one digit entails writing down 4 symbols. While you're at it, your answer contains 9 incorrectly capitalized words that you can fix... – Tanner Swett Oct 10 '17 at 13:33
  • 2
    @TannerSwett Since we're in pedantic land, 'digit' implies 10 values which is why it's kind of funny that we typically call technologies that use binary 'digital'. – JimmyJames Oct 10 '17 at 15:54
  • 2
    `(4^n) / (2^n) = 2^n` which means you can represents exponentially (`2^n` times) more states but that only represents twice (`log2(4^n) / log2(2^n) = 2n / n = 2`) more storage. Remember that `storage capacity in bits = log2(number of states)` – zakinster Oct 10 '17 at 16:27
  • 2
    @JimmyJames but 'digit' came from the word for finger, because extending or retracting a finger is a representation of a bit. (Counting on your fingers does not use bits though, although it should: we could count over a thousand that way. They've been teaching kids math wrong all these years.) –  Oct 10 '17 at 18:21
  • @nocomprende, to you I say: 645. ;) (Kidding.) Oh, and if you can *comfortably* show 9 in that system I'd be impressed. – Wildcard Oct 12 '17 at 00:28