A Machine Learning and Data Science Blog: August 2009

Introduction to image registration

Tutorial on image registration
Image registration (IR) is a process of overlaying images (two or more) of the same scene taken at different times from different viewpoints, and/or by different sensors. The registration geometrically aligns two images.

Approaches:
The reviewed approaches are classified according to their nature
1. Area based
2. Feature based

Image registration procedures:

Feature detection
Feature matching
Mapping function design
Image transformation and re-sampling

Typically IR is required in multi spectral classification, environmental monitoring, change detection, image mosaing, weather forecasting,, creating super-resolution images, geographical information systems (GIS) etc (basically capturing image with remote sensing).
In this post we will look into following

Various aspects and problem of image registration.
Both, area based and feature based approaches to feature selection.
Review of existing algorithm for feature matching.
Methods for mapping function design.

Image registration methodology
IR widely used in remote sensing, medical imaging, computer vision. In general its application can be divided into four main groups.

Different viewpoints – (multi view analysis):
Images of same scene are acquired from different viewpoints. The aim is to gain larger 2D view or a 3D representation of the scanned scene.

Different times – (multi temporal analysis)
Images of same scene are acquired from different times often on regular basis, possibly under different condition. The aim is to find and evaluate changes in the scene which appeared between the consecutive image acquisitions

Different sensors – (multi modal analysis)
Images of same scene are acquired from different sensors. The aim is to integrate the information obtained from different source streams to gain more complex and detailed scene representation.

Scene to model registration
Images of a scene and a model of the scene are registered. The model can be computer representation of the scene (for instance maps).

Registration methods

Majority of registration method consists of following four steps.

Feature detection
Feature matching
Transform model estimation
Image re-sampling

Feature detection:
Salient and distinctive objects like closed boundary regions, edges, contours, line intersections, corners etc are manually or preferably automatically detected. For further processing these features can be represented by their point representative’s like center of gravity, line endings, distinctive points, which are called control points (CPs) in the literature.

Feature matching:
In this step, the correspondence between the features detected in the sensed images and those detected in reference images is established. Various feature descriptors and similarity measures along with spatial relationships among the features are sued for that purpose.

Transform model estimations:
The type and parameters of the so-called mapping functions, aligning the sensed image with the reference image, are estimated. The parameters of the mapping functions are computed by means of established feature correspondence.

Image re-sampling and transformation:
The scene image is transformed by means of mapping functions. Image values in non-integer coordinates are computed by the appropriate interpolation technique.

Implementation:
The implementation of each registration step has its typical problems
First, we have to decide what kind of features is appropriate for the given task. The features should be distinctive objects, which are frequently spread over the images and which are easily detectable. The detection should have good localization accuracy and should not be sensitive to the assumed image degradation.

Mapping function should be chosen according to the prior known information about the acquisition process and expected image degradation.

Feature detection: Feature based methods

Region features – Closed boundary regions of appropriate size
-The regions are often represented by their center of gravity which is invariant w.r.t rotation, scaling and skewing and stable under random noise and gray level variation.
- Region features are detected by means of segmentation methods. The accuracy of segmentation can significantly influence the resulting registration.

Line features:
The LF can be representations of general line segments, object contours, costal lines, and roads.
Standard edge detection methods, like canny detector or a detector based on the laplacian of Gaussian are employed for the line feature detection.

Point features:
The point features group consists of methods working with line intersection, road crossings centroids of water regions..etc high variances points, local curvature discontinuous detected using gobar wavelet, inflection points of curves. The core algorithms of feature detectors in most cases follow the definitions of the ‘point’ as line intersection, centroids closed-boundary region or local modules maxima of the wavelet transform.

Feature based methods is recommended if the images contain enough distinctive and easily detectable objects.

Feature matching:
The detected features in the reference and sensed images can be matched by means of the image intensity values in their close neighborhood the feature special distribution or the feature symbolic description. Some methods, while looking for the feature correspondence, simultaneously estimate the parameters of mapping function and merge the feature matching and transform mode estimation registration steps.

Area based methods:
Some time called correlation-like methods or template matching. These methods deal with the images without attempting to detect salient objects. Classical area based methods like cross correlation exploit for matching directly image intensity, without any structural analysis consequently they are sensitive to the intensity changes introduced for instance by none varying illumination.

Correlation like methods
The classical representative of the area based method is normalized CC and its modification.

Fourier methods:
If an acceleration of the computation speed is needed or if the images were acquired under varying conditions or they are corrupted by frequency dependent noise, then Fourier methods is preferred rather than the correlation like methods. They exploit representation of the image in the frequency domain. The phase correlation methods based on the Fourier shift theorem and were originally proposed for the registration of translated images.

It computed the cross power spectrum of the sense and reference image and looks for the location of the peak in its inverse. The method shows strong robustness against the correlated and frequency dependent noise and non-uniform time varying illumination disturbance.

Mutual information methods:
The MI methods are the last group of the area based methods to be reviewed. They have appeared recently and represent the leading tech in multimodal registration. Registration of multimodal images is the difficult tasks, but often necessary to solve, especially in medical imaging. The MI, originating from the information theory, is a measure of statistical dependency between two data sets and it is particularly suitable for registration of images from different modalisation. MI was maximized using the gradient descent optimization method.

Feature based methods:
We assume that two sets of features in the reference and sensed images represented by the CP’s have been detected.
Methods using spatial registration:
Points method based primarily on the spatial relation among the features are usually applied if detected features are ambiguous or if their neighborhood are locally distorted.
Few more methods: Methods using invariant descriptions, relax methods, pyramid and wavelets.

Transform model estimation:
After the feature correspondence has been established the mapping function is constructed . It should transform the sensed image to overlay it over the reference one. The correspondence of the CP’s from the sensed and reference image together with the fact that the corresponding CP Paris should be as close as possible after the sense image transformation are employed in the mapping function design.

Mapping methods:

Global Mapping model
Local mapping models
Elastic registration

Image file formats and file hedaers

In this post we will discuss about various file formats used for representing Image data. Firstly, we will start with basic understanding of types of file formats and brief details about each format. Later on we will discuss the file header details on each of the image file formats.
Image File formats
1. BMP
2. GIF
3. PNG
4. TIFF
5. JPEG
What each format stands for and its application

BMP:

BMP is standard windows image format on DOS and windows compatible systems. BMP format supports RGB, Indexed Color, Grayscale, and bitmap color modes.BMP images are generally written bottom to top, however, you can select the Flip Row Order option to write them from top to bottom

GIF:

Graphics Interchange Format (GIF) is the file format commonly used to display indexed-color graphics and images in hypertext mark-up language (HTML) documents over the World Wide Web and other online services. GIF is an LZW-compressed format designed to minimize file size and electronic transfer time. GIF format preserves transparency in indexed-color images; however, it does not support alpha channels.

PNG:

Developed as a patent-free alternative to GIF, Portable Network Graphics (PNG) format is used for lossless compression and for display of images on the World Wide Web. Unlike GIF, PNG supports 24-bit images and produces background transparency without jagged edges; however, some Web browsers do not support PNG images. PNG format supports RGB, indexed-color, gray scale, and Bitmap-mode images without alpha channels. PNG preserves transparency in gray scale and RGB images.

TIFF:

Tagged-Image File Format (TIFF) is used to exchange files between applications and computer platforms. TIFF is a flexible bitmap image format supported by virtually all paint, image-editing, and page-layout applications. Also, virtually all desktop scanners can produce TIFF images. TIFF format supports CMYK, RGB, Lab, indexed-color, and gray scale images with alpha channels and Bitmap-mode images without alpha channels. Photoshop can save layers in a TIFF file; however, if you open the file in another application, only the flattened image is visible. Photoshop can also save annotations, transparency, and multiresolution pyramid data in TIFF format.

JPEG:

Joint Photographic Experts Group (JPEG) format is commonly used to display photographs and other continuous-tone images in hypertext markup language (HTML) documents over the World Wide Web and other online services. JPEG format supports CMYK, RGB, and Grayscale color modes, and does not support alpha channels. Unlike GIF format, JPEG retains all color information in an RGB image but compresses file size by selectively discarding data. A JPEG image is automatically decompressed when opened. A higher level of compression results in lower image quality, and a lower level of compression results in better image quality. In most cases, the Maximum quality option produces a result indistinguishable from the original.

Day2 we will discuss about each of the file headers discussed above

Fixed math - Fixed point multiplication

Fixed point Multiplication:

For fixed point multiplication, two operands need not be in same Q-format.

QIResult = QIA + QIB.
QFResult = QFA + QFB.

For example if two operands are in same format (QI) the resultant value after multiplication will be 2 X QI (twice of QI).

Consider two fixed point values which are in Q2.6 and Q1.7 format. The integer part of the product will be sum of integer bits taken by each operand, fractional part will be sum of fractional bits of each operand.

Therefore => Q2.6 X Q1.7
=> Q3.13

Word length:

WL required to store the product will be the sum of the WL’s of each operand. In the above example WL of each operand taken as 8-bit so the resultant will be twice the WL of any of the operands WL i.e. WLR = 16-bit

Saturation:

If two operands are of B-bits the resultant product will be of (2 x B) bits. If you want to store (2 x B) bits in X bits data type, you need to saturate the result and then store it in the given data type.

Examples to come

Fixed point arithmetic - Fixed point addition

Introduction:

We know that a given DSP has a finite word length and each number has to be represented in that word length only. Due to the finite word length two numbers of different precision i.e. two numbers in different formats can’t be added directly. Hence given numbers for scaled down to same format.
Scaling is a process that is used to fit a given number into the specified format by adjusting the dynamic range.

During addition if the result is beyond the most positive value in that range or if the result is below the most negative value in that range then overflow occurs. Hence when the result is out of range, normalization or saturation is used. Normalization is a process that takes care of overflow situation by scaling the result. Saturation is a process by which the maximum positive value or the maximum negative value is taken based on the overflow condition.

Fixed point addition:

Let us consider operands with A-Q13 and B-Q11 format respectively.

Note: Decimal point in Q-format notation is just an imaginary point.

**For fixed point addition two operands should be in same Q-format.

Steps to perform fixed point addition

1. Convert float to fixed point.

2. Make two operands in same Q-format.

3. Finally, performing arithmetic addition.

Convert float to fixed point

Example:

float a = 12.36, b = 3.12;

1^st Method

1. Normalizing two float values with no. of bits required to represent integer part Q_I

If ( Q_Ia > Q_Ib)

{

a = a / (2 ^ Q_Ia);

b = b / (2 ^ Q_Ia);

}

else

{

a = a / (2 ^ Q_Ib);

b = b / (2 ^ Q_Ib);

}

Q_Ia = 4-bits, Q_Ib = 2-bits (Q_Ia > Q_Ib)

a_norm = 12.36 / (2 ^ 4); a = 0.7725;

b_norm = 3.12 / (2 ^ 4); b = 0.195;

2. Converting normalized values to fixed point (Assuming word length WL = 16-bit)

a_fixed = (a_norm X (2^15)); = (0.7725 * 2^15) = 25313;

b_fixed = (b_norm X (2^15)); = (0.195 * 2^15) = 6389;

Now both the operands are in the same Q-format i.e. Q15 format.

3. Finally, perform arithmetic addition between the two operands which are in same Q-format.

addition_fixed = (25313 + 6389)

= (31702)

2nd Method

1. Find the max of Q_I between two operands

Q_Ia = 4, Q_Ib = 2.

2. Assuming WL is 16-bit,

Q-format = (16 – (4 + 1 sign bit)) = 11bit

a_fixed = (12.36 * 2 ^ 11) = (short)25313.

b_fixed = (3.12 * 2 ^ 11) = (short) 6389.

3. Finally, perform arithmetic addition b/n two operands which are in same Q- format.

addition_fixed = (25313 + 6389)

= (31702)

Steps to perform fixed point subtraction:

Follow the Steps 1 & 2 as explained for fixed point addition

Finally perform subtraction on two fixed point operands which are in same format.

Format for representing fixed point numbers

Fixed point numbers and number notation:

In Fixed-point numbers the imaginary binary point plays a significant role while interpreting numbers. When dealing with numbers the digits to the left of imaginary binary point represents the integer and the digits to the right of binary point represents fractions. Here we study a format for representing fixed point numbers which we will be using here on.

Q notation (QF):

In this notation the total numbers of fractional bits represent the number format. One bit is assumed for Sign (MSB always). The number format representation doesn’t convey any information regarding the word length.
The notation QF means a number with F bits dedicated for the fractional part. If W represents the word length of the processor, then QF number means, F Fractional Bits, W - (F+1) Integer Bits and
1 Sign Bit.
If F is the number of bits required to represent fraction part QF then
Number of bits to represent Integer Bits = W - (F+1)
By default '1' bit is used for sign representation.

For example,

Q15 means 15 Fractional bits, one Sign Bit and zero bits for integer part. Ex:0.435.
Q14 means 14 Fractional bits, one Sign Bit and 1 bit for integer part. Ex: 1.435
Q1 means 1 Fractional bit, one Sign Bit and 14 bits for integer part. Ex: 16383.435

How to select a Q-point format, to represent a float value in fixed point format

For example consider a float value 12.435

Steps involved in selecting Q-format for the above given value:

1. Calculate the number of bits needed to represent integer part (QI) of the given float value (4 bits).
2. Calculate the word length needed to represent the float value in fixed point
a. 1 bit for sign representation
b. 4 bits for integer representation.
c. QF = 10, from equation discussed previous post ceiling (log2 (1/ ε)) for given example , ε = 0.0001
d. WL = QI + QF + S, therefore WL = (4 + 10 + 1) = 15 (bits) to guaranty both range and resolution.
3. So Q-format need to represent above float numbers is Q10 format.

Simple Procedure to identify Q-format for given float point value

Assuming that the word length used by the programmer to represent a float value in fixed point format is WL = 16. (This assumption should be made by programmer depending up on the maximum range of input float value).

Note: Float to fixed conversion comes at cost of precision loss.

Consider above example:

Float value = 12.435

(12)10 -> (1100)2 -> 4 bits are required to represent integer part.
Sign -> 1 bit for sing representation

As we have seen before, number of fractional bits taken by float value is the Q-point format with which the float value is represented in fixed point.

Q-format for the above float value is = (16 – (4 + 1)) i.e. QF = 11

So the above float value can be represented in Q11 format.

Fixed Point Representation: Q-formats

Q-Formats

Fixed point number representations are termed as Q-point format(As IEEE 754 standard for floating point representation).

Let us start with the basic notation of a Q-point for a fixed point number.

Convention is as follows

Q [QI]. [QF]

QI -> Integer bits

QF -> Fraction bits

So sum of QI and QF gives the total number of bits that is needed to represent a number in fixed point format.

QI + QF = Word length and this word length corresponds to variable widths supported on various processors. Typical word lengths would be 8, 16 and 32-bit.

For example: Q2.6 number would be an 8-bit number with 2 integer bits and 6 fraction bits.

Fixed point range – Integer portion

The range of a floating point variable in an algorithm sets the number of bits (QI) required to represent the integer portion of the number.

This relationship for unsigned numbers is defined by the equations:

0 ≤ α ≤ 2^QI

Where α is floating point variable.

(We will see few examples on this in coming sessions)

If floating point number is a singed value.

-2^(QI-1) ≤ α ≤ 2^(QI-1)

Where α is floating point variable.

Resolution of a fixed point number – Fractional portion:

The resolution of a fixed point number is set by the remaining fraction bits (QF) for a given word length (WL) for the variable. For a given word length and dynamic range of a variable the resolution is limited. If higher resolution is needed for a given range then the WL of the variable must be increased to provide this resolution.

The resolution ε, of a fixed point number is as follows

ε = 1/ (2^QF)

Therefore the number of fractional (Q_F) bits required for a particular resolution is defined by the equation.

Q_F= log₂ (1/ ε)

However since Q_Fis a integer values only, the results of logarithm result:

Q_F= celling (log₂ (1/ ε))

We will discussion more and in detail math on fixed point numbers in coming posts.

Thanks and have a nice reading

International Conference In MIT,Manipal

International Conference In MIT,Manipal (Among Top 10 Deemed Universities)

An International conference is to be held in MIT,Manipal on December from 10 to 12 (more details in our conference website www.icedsp.org). MIT would be grateful to you if you can fund some sponsorship for the International conference. I am sure your esteemed organisation will benefit from the same.Your help in this regard is much appreciated. Please see the attached file for possible sponsor options.
For further details, please contact Dr. Somashekara Bhat, Department of Electronics and Communication Engineering , MIT Manipal.

A Machine Learning and Data Science Blog

Introduction to image registration

Image file formats and file hedaers

Fixed math - Fixed point multiplication

Fixed point arithmetic - Fixed point addition

Format for representing fixed point numbers

Fixed Point Representation: Q-formats

International Conference In MIT,Manipal

Related Posts

Twitter Updates

Random Posts

Disclaimer

Recent Comments