A Machine Learning and Data Science Blog: JPEG Image Compression Explained

Topics covered:

What is Compression?
Basic Need of Compression.
Types of Compression.
JPEG Encoder explained.

What is Compression?

Compression means representing data (image, audio, video, speech or voice, ...) with fewer bits than its original representation requires.

Basic Need of Compression
Using transmission bandwidth effectively, and using storage media to the maximum.

Types of Compression:

Lossless: Compressing the data so that the decompressed output is exactly identical to the original input data.
Lossy: Compressing the data with some loss of information, chosen with the human visual and psychoacoustic systems in mind, i.e. discarding the higher-frequency components, to which the human visual system is much less sensitive.
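
To make the difference concrete, here is a minimal sketch using Python's standard `zlib` module. The sample data and the `step` value are made up for illustration, and the "lossy" part is only a toy coarsening of values, not what JPEG actually does.

```python
import zlib

# Hypothetical sample data: a short repeating pattern of 8-bit values.
data = bytes([10, 11, 10, 12, 11, 10, 10, 11] * 32)

# Lossless: the decompressed bytes are identical to the input.
packed = zlib.compress(data)
restored = zlib.decompress(packed)
lossless_ok = restored == data

# Lossy (toy example): round every value down to a multiple of `step`,
# discarding small variations before compressing.
step = 4
coarse = bytes((b // step) * step for b in data)
lossy_packed = zlib.compress(coarse)
max_error = max(abs(a - b) for a, b in zip(data, coarse))

print(lossless_ok, len(data), len(packed), len(lossy_packed), max_error)
```

The lossless round trip returns the exact input, while the coarsened data can no longer be restored exactly; the error per sample stays below `step`.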

JPEG Image Compression standards.
In general, an image is nothing but a group of pixels. Each pixel holds the brightness and color information of the image at a particular coordinate.
Red, green and blue are the primary color components of a color image. By combining these components we can get any color we desire.

Step-by-step procedure for compressing input image data.

Input: Read an MxN block (in general 8x8) of the input RGB image; each color component is 8 bits.
RGB->YCbCr: Convert the MxN RGB samples to YCbCr (luma and chroma components).
DCT: Perform the Discrete Cosine Transform (DCT) on each of the MxN luma and chroma components.
Scaling: Quantize the coefficient matrix output by the DCT, which is the same size as the input to this block.
Scanning: Zig-zag scan the resulting MxN matrix into a one-dimensional array.
Huffman coding: Perform Huffman coding on the resulting 1-D array.
Bitstream: Finally, the resulting bitstream is the JPEG encoder output.
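
The input step can be sketched as follows. The function name `split_into_blocks` is hypothetical, and padding the edges by repeating the last row and column is an assumption for images whose size is not a multiple of 8; a real encoder may pad differently.

```python
def split_into_blocks(channel, size=8):
    """Split a 2-D list (rows of samples) into size x size blocks.

    Edges are padded by repeating the last row/column (an assumption
    made for this sketch).
    """
    height, width = len(channel), len(channel[0])
    padded_h = -(-height // size) * size  # round up to a multiple of size
    padded_w = -(-width // size) * size
    blocks = []
    for top in range(0, padded_h, size):
        for left in range(0, padded_w, size):
            block = [
                [channel[min(top + r, height - 1)][min(left + c, width - 1)]
                 for c in range(size)]
                for r in range(size)
            ]
            blocks.append(block)
    return blocks

# A made-up 10x20 single-channel image.
image = [[(x + y) % 256 for x in range(20)] for y in range(10)]
blocks = split_into_blocks(image)
print(len(blocks))
```

Each block would then be passed through the remaining steps independently.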

In detail:

Read an 8x8 matrix of the red, green and blue components of the input image. Convert the RGB components to YCbCr (luma and chroma components) using the equations below.

Y = 0.299R + 0.587G + 0.114B
Cb = -0.1687R - 0.3313G + 0.5B + 128
Cr = 0.5R - 0.4187G - 0.0813B + 128

(The offsets shown are for 8-bit samples, as used in JFIF: none for Y, and 128 for Cb and Cr.)
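
A minimal sketch of this conversion for a single pixel, assuming the JFIF convention for 8-bit samples (no offset on Y, an offset of 128 on Cb and Cr). The function name `rgb_to_ycbcr` and the rounding and clamping to 0..255 are choices made for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (JFIF convention assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b + 128

    def clamp(v):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)

print(rgb_to_ycbcr(255, 255, 255))  # white
print(rgb_to_ycbcr(0, 0, 0))        # black
print(rgb_to_ycbcr(255, 0, 0))      # pure red
```

Gray pixels (R = G = B) land on Cb = Cr = 128, which is why the chroma planes of a grayscale image are flat.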

Then a 2-D DCT is performed on each of the 8x8 matrices of Y, Cb and Cr.

For reference, see the link below.

http://www.cs.cf.ac.uk/Dave/Multimedia/node231.html

The DCT gives an 8x8 coefficient matrix for each of Y, Cb and Cr. Most of the signal energy ends up in the lower-frequency coefficients, which are the ones the human eye is more sensitive to.
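
The transform itself can be sketched with a direct, unoptimized 2-D DCT-II. This is for illustration only: real encoders use fast factorizations, and `dct_2d` is a hypothetical name. The level shift by 128 before the transform follows baseline JPEG practice.

```python
import math

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block (O(N^4), for illustration only)."""
    n = len(block)

    def alpha(k):
        # Orthonormal scaling factors.
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

# A flat block of value 100, level-shifted by 128 before the transform.
flat = [[100 - 128] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))
```

For a flat block all the energy lands in the top-left (DC) coefficient and every other coefficient is zero up to floating-point error, which illustrates why smooth image regions compress so well.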

Then scalar quantization (scaling of the resulting DCT output array) is performed on each of the luma and chroma matrices.

Quantization is the major step in which the actual compression takes place, because this is where information is discarded. It is also a module that consumes a large number of cycles in JPEG compression.
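
As a rough sketch, quantization divides each DCT coefficient by the matching entry of a quantization table and rounds the result. The table below is the example luminance table from Annex K of the JPEG standard; encoders are free to use other tables, and the function names and sample coefficients are made up for this example.

```python
# Example luminance quantization table (JPEG standard, Annex K).
LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply back. The rounding error is not recovered."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

# Made-up coefficients: a DC term plus one low- and one high-frequency term.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = -224.0, 30.0, 20.0

levels = quantize(coeffs, LUMA_TABLE)
approx = dequantize(levels, LUMA_TABLE)
print(levels[0][0], levels[0][1], levels[7][7], approx[0][1])
```

The high-frequency term is divided by a large table entry and rounds to zero, while the low-frequency term survives with a small error (30 becomes 33 after dequantization).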

Zig-zag scanning is then performed on the resulting scaled arrays (Y, Cb and Cr).
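
A compact way to sketch the zig-zag scan is to sort the block positions by anti-diagonal, alternating the direction on each diagonal. The helper names `zigzag_order` and `zigzag_scan` are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag order for an n x n block."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Primary key: the anti-diagonal r + c.
        # Odd diagonals run top-right to bottom-left (row ascending),
        # even diagonals run bottom-left to top-right (column ascending).
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def zigzag_scan(block):
    """Flatten a square block into a 1-D list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# A block whose entries equal their row-major index, to make the order visible.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
scanned = zigzag_scan(block)
print(scanned[:10])
```

The scan places low-frequency coefficients first, so the zeros produced by quantization tend to cluster at the end of the 1-D array.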

Huffman run-length coding is then performed on the 1-D array obtained from the zig-zag scan.
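
Below is a simplified sketch of this stage: zero runs are collapsed into (run, value) pairs, and a Huffman code is then built over those pairs. Real JPEG differs in the details (the DC coefficient is coded separately, symbols are (run, size) pairs with extra bits, and the tables are either predefined or stored in the file), so the symbol format, the `EOB` marker and the function names here are assumptions for illustration.

```python
import heapq
from collections import Counter

EOB = ("EOB",)  # hypothetical end-of-block marker

def run_length(values):
    """Encode a list as (zero_run, value) pairs; trailing zeros become EOB."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append(EOB)
    return pairs

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Made-up zig-zag output: a few nonzero values followed by zeros.
scanned = [-14, 3, 0, 0, 3, 0, 1] + [0] * 57
pairs = run_length(scanned)
codes = huffman_code(pairs)
bitstream = "".join(codes[p] for p in pairs)
print(pairs)
print(len(bitstream), "bits")
```

The 64 coefficients collapse to five symbols, and the long tail of zeros costs only the single end-of-block symbol.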

The resultant array is the output bitstream of the JPEG encoder.

ANY QUESTIONS?
