Denormalized floating point numbers

What are denormalized floating point numbers? How are they represented?

As you may have noticed from the previous discussion on floating point representation, the normalized IEEE 754 representation on its own has a few serious limitations.

(-1)^S x 2^(E-127) x 1.M

IEEE 754 single precision floating point representation
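As a rough illustration of the formula above, here is a small Python sketch that splits a single precision value into its S, E and M fields and rebuilds the number from them. The helper names `fields` and `normalized_value` are made up for this post, and the sketch assumes a normalized number (E between 1 and 254).

```python
import struct

def fields(x):
    """Return (S, E, M) of x after rounding it to IEEE 754 single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31            # 1 sign bit
    e = (bits >> 23) & 0xFF   # 8 exponent bits (biased by 127)
    m = bits & 0x7FFFFF       # 23 mantissa bits
    return s, e, m

def normalized_value(s, e, m):
    """Evaluate (-1)^S x 2^(E-127) x 1.M; only valid for 1 <= E <= 254."""
    return (-1.0) ** s * 2.0 ** (e - 127) * (1 + m / 2 ** 23)

s, e, m = fields(-6.5)                # -6.5 = -1.625 x 2^2
print(s, e, format(m, "023b"))        # 1 129 10100000000000000000000
print(normalized_value(s, e, m))      # -6.5
```

Note that `normalized_value` always adds the implicit leading 1, which is exactly why this formula can never produce 0.0.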

How do we represent 0.0 in the IEEE 754 floating point representation? Zero cannot be written as the product of a power of two and a mantissa greater than or equal to one.

So how do we represent 0.0 then?

Here is the explanation: in the IEEE 754 representation, the all-zero exponent field (E = 0) is reserved for zero itself (when M is also all zeros) and for numbers very close to zero, that is, numbers smaller than 2^-126. In single precision, 2^-126 is the least positive number that can be represented in normalized form, as discussed in earlier posts,

i.e. 0|00000001|00000000000000000000000
S|-----E-----|---------------M---------------|
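The bit pattern above can be checked with a short sketch. It reinterprets raw 32-bit patterns as single precision values; `from_bits` is a hypothetical helper name used only for this illustration.

```python
import struct

def from_bits(bits):
    """Interpret a 32-bit integer as an IEEE 754 single precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# S | E (8 bits) | M (23 bits), written with underscores between the fields
smallest_normalized = from_bits(0b0_00000001_00000000000000000000000)
zero = from_bits(0b0_00000000_00000000000000000000000)

print(smallest_normalized == 2.0 ** -126)   # True
print(zero)                                 # 0.0
```

So E = 1 with M = 0 gives 2^-126, and the all-zero pattern is 0.0.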

Numbers of this kind (closer to zero than 2^-126) are represented in a slightly different way.

The exponent is always kept equal to -126, and the mantissa is a number greater than or equal to zero and less than one (i.e. 0.M instead of 1.M).

Here is an example of how to represent a number very close to zero:

Consider => 5 x 2^-129

The mantissa used to represent the above number is obtained as follows:

=> [5 / (2^3)] x 2^-126

=> [0.625 x 2^-126]

=> [(1/2 + 1/8) x 2^-126] = (0.101 in binary) x 2^-126

So the representation of 5 x 2^-129 is as shown below, with an all-zero exponent field and a mantissa field that starts with 101:

0|00000000|10100000000000000000000|
S|--E-8bit---|0.M-----------23bit-----------|
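To double-check the worked example, the following sketch packs 5 x 2^-129 into single precision with Python's `struct` module and prints the stored fields. The variable names are only illustrative; the value is exactly representable, so no rounding is involved.

```python
import struct

value = 5 * 2.0 ** -129   # the number from the example above
bits = struct.unpack(">I", struct.pack(">f", value))[0]

s = bits >> 31            # sign bit
e = (bits >> 23) & 0xFF   # exponent field; 0 marks a denormalized number
m = bits & 0x7FFFFF       # 23-bit mantissa field

pattern = f"{s}|{e:08b}|{m:023b}"
print(pattern)

# Rebuild the value with the denormalized formula 2^-126 x 0.M
rebuilt = 2.0 ** -126 * (m / 2 ** 23)
print(rebuilt == value)
```

The exponent field should come out as all zeros and the mantissa field should begin with 101, matching 0.101 in binary.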


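Numbers stored with a mantissa less than one (0.M) are called denormalized numbers.

Denormalized numbers are stored less accurately than normalized numbers, because the leading zeros of 0.M use up mantissa bits that would otherwise hold significant digits.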
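One way to see this loss of accuracy is to round the same fraction to single precision at two different scales and compare the relative errors. This is only a sketch: `to_single` and `relative_error` are hypothetical helpers, and the two test values were picked arbitrarily for illustration.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def relative_error(x):
    """Relative error introduced by storing x in single precision."""
    return abs(to_single(x) - x) / x

normal = (1 / 3) * 2.0 ** -100     # well inside the normalized range
denormal = (1 / 3) * 2.0 ** -145   # below 2^-126, so stored as 0.M x 2^-126

print(relative_error(normal))      # tiny: about 24 significant bits survive
print(relative_error(denormal))    # large: only a few significant bits survive
```

In the denormalized case the value is about 5.33 x 2^-149, which has to be rounded to 5 x 2^-149, so the error is several percent rather than a few parts in a hundred million.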
The least positive real number that can be represented in single precision is 2^-149, as shown below.

For single precision

(-1)^S x 2^-126 x 0.M

Substitute S = 0, E = 0 (so the exponent is fixed at -126), and the smallest non-zero M (23 bits), i.e. 0.M = 2^-23

So the least positive real number = 2^-(126 + 23) = 2^-149

i.e. 0|00000000|00000000000000000000001
S|---E-8bit--|0.M----------23bit-------------|
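The same result can be confirmed with a short sketch that decodes the bit pattern above and then tries to store something even smaller. The helper names `from_bits` and `to_single` are assumptions made for this illustration.

```python
import struct

def from_bits(bits):
    """Interpret a 32-bit integer as an IEEE 754 single precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def to_single(x):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

smallest = from_bits(0x00000001)   # S = 0, E = 0, M = 000...001
print(smallest == 2.0 ** -149)     # True

# Anything much smaller cannot be stored and underflows to zero
print(to_single(2.0 ** -151))      # 0.0
```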

Topics to come: Introduction to floating point arithmetic (addition, multiplication, division)
