Why does using Discrete Cosine Transform lead to data compression?

Question

The DCT is used in the JPEG standard, along with Huffman encoding to further compress the result. I understand that most of the information in an image, which is spatial-domain data, is "present in low frequencies". This is because adjacent pixels usually differ only slightly in color value.

In signal processing, an image is represented numerically, with numbers giving the pixel color intensity. When we apply a DCT to it, we still get numbers as the result. Where exactly does the data get compressed via the DCT?

A related answer of mine, on Stack Overflow: http://stackoverflow.com/questions/10666583/why-is-the-gif-format-not-the-most-compact-format-for-natural-images/10666940#10666940 – Li-aung Yip May 12 '15 at 06:23

score 2 · Answer 1 · answered May 11 '15 at 22:08


The DCT by itself does not really compress anything. All it does is transform the image data from the spatial domain into the frequency domain. For typical images this 'concentrates' the information into a few low-frequency coefficients, so that some of the high-frequency information can be either discarded or stored at a lower resolution. This is where you actually get the space savings: throwing out data. Decompressing the image involves substituting 0 for the discarded coefficients and then taking the inverse DCT to get back something similar to the original data.
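A minimal sketch of that idea, under some loud assumptions: it uses a single made-up row of eight pixel values rather than JPEG's 8x8 blocks, a hand-written orthonormal DCT-II (`dct` and `idct` are illustrative helpers, not a library API), and it simply zeroes everything but the three lowest-frequency coefficients instead of applying JPEG's actual quantization step.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence, written directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(x)))
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1 if k == 0 else 2) / n) * ck
                       * math.cos(math.pi * (i + 0.5) * k / n)
                       for k, ck in enumerate(c)))
    return out

# Made-up row of 8 pixel intensities that changes slowly (an assumption,
# chosen to mimic the "adjacent pixels differ only slightly" case).
row = [100, 102, 104, 107, 110, 112, 113, 115]

coeffs = dct(row)

# "Throw out data": keep the 3 lowest-frequency coefficients,
# substitute 0 for the rest, then invert.
kept = coeffs[:3] + [0.0] * (len(coeffs) - 3)
restored = idct(kept)
max_err = max(abs(a - b) for a, b in zip(row, restored))

print([round(c, 2) for c in coeffs])    # one large value, the rest small
print([round(v, 1) for v in restored])  # close to the original row
print(round(max_err, 2))                # worst-case per-pixel error
```

Calling `idct(coeffs)` without zeroing anything recovers the row up to floating-point error, so in this sketch the loss comes entirely from the discarded coefficients, not from the transform.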


alex.forencich


Somebody said that the resulting numbers from DCT have less resolution, not less magnitude. In other words, they are taking away the least significant bits, not the most significant. What does this mean? Are you trying to say that numbers resulting from DCT from various 8x8 blocks have little difference in magnitude and this is used to later compress data? I don't see how and what precisely is being discarded without an example. – quantum231 May 11 '15 at 22:25
Both JPEG and MP3 are lossy compression formats, designed around the limits of human perception. A JPEG-compressed image may have subtle visual artifacts not present in the original image, which result from high-frequency components being attenuated or lost during compression. – MarkU May 12 '15 at 00:13
Most of the 'information' in the image that we can see is at low spatial frequencies, which end up being a small portion of the outputs of the DCT. The majority of the outputs will represent higher spatial frequencies, and the idea is to use fewer bits to store this information, so the values are weighted and the less-important information is discarded. – alex.forencich May 12 '15 at 00:14
DCT performed with high precision arithmetic is reversible and should allow recreating the original image with no loss of resolution. But DCT performed as part of a compression system is likely to round to 8, 12, or 16 bits, and the rounding (not the DCT itself) reduces resolution. – May 12 '15 at 12:18
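
The point made in these comments, that resolution is lost in the rounding rather than in the transform, can be sketched with a few numbers. Everything here is an assumption for illustration: the coefficient values are hypothetical DCT outputs for one row of eight samples, and the step sizes are made up (they are not the JPEG quantization tables).

```python
# Hypothetical DCT outputs for one row of 8 samples (illustrative values only).
coeffs = [305.12, -14.44, 0.35, -1.21, 0.53, -0.18, 0.27, -0.09]

# Assumed step sizes: fine for low frequencies, coarse for high frequencies.
steps = [1, 2, 4, 8, 16, 16, 32, 32]

# Rounding to a multiple of the step removes the least significant part
# of each coefficient; a larger step means lower resolution.
quantized = [round(c / q) for c, q in zip(coeffs, steps)]

# The decoder can only multiply back; the rounded-off part is gone.
dequantized = [level * q for level, q in zip(quantized, steps)]

errors = [abs(c - d) for c, d in zip(coeffs, dequantized)]

print(quantized)    # small integers, mostly zero at high frequencies
print(dequantized)
print([round(e, 2) for e in errors])
```

The quantized list is what would then go to an entropy coder such as Huffman coding: a long run of zeros and a few small integers takes far fewer bits to store than eight full-precision values.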
