JPEG

Introduction

The JPEG still image compression standard was specified by a committee called the "Joint Photographic Experts Group". (I would have thought they would be called the "Joint Photographic Experts Committee", but maybe they thought JPEC didn't sound as good as JPEG.)
This group was set up way back in the eighties to develop a standard for encoding continuous-tone greyscale and colour images. The JPEG ISO standard was more or less settled in 1991, with the aims of providing 'state of the art' (circa 1991) compression, being useful for practically any kind of continuous-tone image, and being implementable on a wide range of CPU and hardware configurations. To achieve this, the JPEG standard allows a number of modes of operation:
Sequential encoding, where the image is built up spatially, top to bottom, left to right.
Progressive encoding, where a low quality image is progressively refined with higher detail.
Lossless encoding, for which a decompressed image is identical to the source.
Hierarchical encoding, where the image is encoded at multiple resolutions.

Lossless Encoding

To dispense with the simpler scheme first, I'll cover lossless encoding now. This coding scheme is rarely used in practice. In fact, I wouldn't be at all surprised if many JPEG decoders can't actually handle this method, and I have yet to find a program which claims to encode with it.

Lossless encoding makes use of a relatively simple prediction and differencing method. This can give around 2:1 compression on source images of between 2 and 16 bits per pixel, i.e. 4 to 65,536 levels of grey. If the image has multiple colour components (like RGB), then each component is encoded separately.

The encoding goes like this: the source image is scanned sequentially, left to right, top to bottom. The value of the current pixel is predicted from the values of the previous pixels, and the difference between the predicted value and the actual value is what gets encoded, using Huffman or arithmetic coding.


There are 8 prediction schemes available to the encoder, although scheme 0 is reserved for the hierarchical mode. Also, scheme 1 is always used for the very first scan line, and scheme 2 is used for the first pixel of each new row. That's basically all there is to it.
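
To make this concrete, here is a minimal Python sketch of the prediction and differencing step. It is illustrative rather than spec-exact: the predictor formulas follow the numbering described above, the very first pixel is simply predicted as mid-grey (assuming 8-bit samples), and the resulting differences would then be handed to the Huffman or arithmetic coder.

    # Rough sketch of JPEG lossless prediction (illustrative, not a spec-exact codec).
    # Ra = pixel to the left, Rb = pixel above, Rc = pixel above-left.
    import numpy as np

    def predict(ra, rb, rc, scheme):
        """The seven non-trivial prediction schemes (1-7); 0 means 'no prediction'."""
        if scheme == 1: return ra
        if scheme == 2: return rb
        if scheme == 3: return rc
        if scheme == 4: return ra + rb - rc
        if scheme == 5: return ra + ((rb - rc) >> 1)
        if scheme == 6: return rb + ((ra - rc) >> 1)
        if scheme == 7: return (ra + rb) >> 1
        return 0

    def lossless_differences(image, scheme=4):
        """Produce the difference values that would then be entropy coded."""
        h, w = image.shape
        diffs = np.zeros((h, w), dtype=np.int32)
        for y in range(h):
            for x in range(w):
                if y == 0 and x == 0:
                    pred = 128                    # first pixel: assume mid-grey for 8-bit samples
                elif y == 0:
                    pred = int(image[y, x - 1])   # first row: scheme 1 (left neighbour)
                elif x == 0:
                    pred = int(image[y - 1, x])   # first pixel of a row: scheme 2 (above)
                else:
                    pred = predict(int(image[y, x - 1]), int(image[y - 1, x]),
                                   int(image[y - 1, x - 1]), scheme)
                diffs[y, x] = int(image[y, x]) - int(pred)
        return diffs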

Lossy Encoding

The lossy compression scheme is where all the action happens for JPEG. This scheme uses the discrete cosine transform (DCT) to transform image blocks into the frequency domain and then compresses the resulting DCT coefficients (recall the module on Spatial and Spectral encoding). The compression ratio can be varied, giving control over the final image quality (and, roughly, the file size). The amount of compression JPEG can achieve depends almost entirely on the source image: the best results come from images with little high-frequency detail.

The DCT is applied to 8x8 pixel blocks, this size being chosen as a trade-off between computational complexity, compression speed and quality. Two steps are then used to compress the coefficients: they are quantised, and then Huffman or arithmetically coded. Quantising the coefficients is the lossy part of the sequence; this is where high-frequency information is discarded.
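
Below is a minimal Python sketch of the per-block transform and quantisation, assuming an 8-bit greyscale block; the uniform quantisation table is made up for illustration and is not one of the standard's example tables.

    # A minimal sketch of the per-block transform and quantisation step.
    import numpy as np

    N = 8
    # Orthonormal DCT-II basis matrix: C[u, x] = alpha(u) * cos((2x+1) * u * pi / 16)
    C = np.array([[np.sqrt((1 if u == 0 else 2) / N) *
                   np.cos((2 * x + 1) * u * np.pi / (2 * N))
                   for x in range(N)] for u in range(N)])

    def dct2(block):
        """2-D DCT of an 8x8 block (transform the rows, then the columns)."""
        return C @ block @ C.T

    def quantise(coeffs, qtable):
        """Divide each coefficient by its quantisation factor and round;
        small high-frequency coefficients collapse to zero here."""
        return np.rint(coeffs / qtable).astype(int)

    # Example: one flat-ish block -> almost all energy ends up in the DC coefficient.
    block = np.full((8, 8), 120.0) + np.random.randn(8, 8)
    block -= 128                      # level shift, as JPEG does before the DCT
    qtable = np.full((8, 8), 16)      # a made-up uniform table, not the standard one
    print(quantise(dct2(block), qtable))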


Lossy Compression Steps (a rough code sketch of some of these follows the list):

1. Convert non-greyscale images into YCbCr components.
2. Downsample the Cb and Cr components.
3. Group pixels into 8x8 blocks for processing.
4. DCT each pixel block.
5. Unwrap (zig-zag order) the coefficients.
6. Scale each coefficient by a 'quantisation' factor.
7. Eliminate near-zero coefficients.
8. Huffman encode the data.
9. Add header info and quantisation factors.
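
To round things off, here is a rough Python sketch of a few of these steps: the colour conversion, the chroma downsampling, and the zig-zag 'unwrap' of a coefficient block (the DCT and quantisation were sketched earlier). The conversion constants are the usual ITU-R BT.601 ones; the function names and array shapes are just illustrative.

    # Sketch of the front and back ends of the lossy pipeline: colour conversion,
    # chroma downsampling and the zig-zag 'unwrap' of an 8x8 coefficient block.
    import numpy as np

    def rgb_to_ycbcr(rgb):
        """Step 1: separate luminance (Y) from chrominance (Cb, Cr)."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y  =  0.299 * r + 0.587 * g + 0.114 * b
        cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
        cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128
        return y, cb, cr

    def downsample_2x(channel):
        """Step 2: average 2x2 neighbourhoods of a chroma channel (4:2:0-style)."""
        h, w = channel.shape
        return channel[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

    def zigzag(block):
        """Step 5: unwrap an 8x8 block into a 1-D run, low frequencies first."""
        order = sorted(((u, v) for u in range(8) for v in range(8)),
                       key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))
        return np.array([block[u, v] for u, v in order])

    # Example: on a random RGB image, Y keeps full resolution, Cb/Cr shrink by half.
    img = np.random.randint(0, 256, (16, 16, 3)).astype(float)
    y, cb, cr = rgb_to_ycbcr(img)
    print(y.shape, downsample_2x(cb).shape)   # (16, 16) (8, 8)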