Representation of floating point numbers:
In the IEEE single-precision representation of a real number, one bit is used to represent the sign; it is set to 0 for a positive number and 1 for a negative one. A representation of the exponent is stored in the next 8 bits, and the remaining twenty-three bits are occupied by a representation of the mantissa of the number.
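The 1 + 8 + 23 bit layout described above can be inspected directly. The following is a minimal sketch using Python's standard `struct` module; the helper name `float_to_fields` is made up for this illustration, and the exponent field is printed exactly as stored, without interpreting it.

```python
import struct

def float_to_fields(x):
    """Split x, stored as an IEEE single-precision value, into its three bit fields."""
    # Pack as a big-endian 32-bit float, then reinterpret the same 4 bytes
    # as an unsigned 32-bit integer so the bits can be masked out.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, as stored
    mantissa = bits & 0x7FFFFF       # remaining 23 bits
    return sign, exponent, mantissa

sign, exponent, mantissa = float_to_fields(-2.5)
print(sign, format(exponent, "08b"), format(mantissa, "023b"))
# 1 10000000 01000000000000000000000
```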
How to represent real numbers in floating point format:
Examples:
1.    Representing 23/4 as a single-precision floating point number.
                  =>   23/4 = 5.75
Converting the above real number to binary form:
                 =>  101.11    (5 in binary is 101;   .75 = 2^-1 + 2^-2 = .11)
Representing the above binary number in SP floating point format (32 bits):
                [(-1)^S x 2^(E - 127) x 1.M]
                =>   101.11 = 1.0111 x 2^2    (normalized so that it matches the form of the equation above)
(The '1' before the binary point is called the hidden bit: it is implied by the format and is not stored.)
Sign             S = 0;                                             No. of bits used to represent sign = 1
Exponent         (E - 127) = 2, i.e. E = 129 (10000001 in binary);  No. of bits used to represent exponent = 8
Mantissa         M = 0111000...;                                    No. of bits used to represent mantissa = 23
Finally, 5.75 in SP floating point representation is as shown below:
                0|10000001|01110000000000000000000
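The same steps (normalize to 1.M x 2^e, add 127 to the exponent, keep 23 mantissa bits) can be sketched in Python. This is only an illustration under stated assumptions: `encode_single` is a hypothetical helper, and it handles ordinary non-zero values only (no zero, infinity, NaN or very small denormalized numbers).

```python
import math

def encode_single(x):
    """Hand-rolled single-precision encoding of a normal, non-zero value."""
    sign = 0 if x > 0 else 1
    # math.frexp gives abs(x) = m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(abs(x))
    significand = m * 2          # shift to the 1.M form: 1 <= significand < 2
    exponent = e - 1             # compensate for the shift
    E = exponent + 127           # stored exponent field
    # Keep 23 bits after the binary point, rounding to nearest.
    M = round((significand - 1) * 2**23)
    if M == 2**23:               # rounding carried over into the hidden bit
        M = 0
        E += 1
    return f"{sign:b}|{E:08b}|{M:023b}"

print(encode_single(23 / 4))
# 0|10000001|01110000000000000000000
```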
Note:  What if the fractional part of a real number cannot be expressed as a finite sum of powers of two (unlike the example above, where .75 = 1/2 + 1/4)? For example, 7/5 is exactly 1.4, but .4 cannot be expressed as a finite sum of powers of two, so 7/5 has an infinite binary expansion: 1.011001100110011001100...
In a single-precision representation, the expansion is rounded at the twenty-third digit after the binary point.
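The effect of this rounding can be seen with a short sketch based on Python's `struct` module. The helper names `stored_mantissa` and `round_trip` are made up for this illustration; note that Python's own floats are double precision, so the comparison is between the double-precision and single-precision approximations of 7/5.

```python
import struct

def stored_mantissa(x):
    """Return the 23 stored mantissa bits of x in single precision, as a string."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return format(bits & 0x7FFFFF, "023b")

def round_trip(x):
    """Store x as single precision and read it back."""
    (stored,) = struct.unpack(">f", struct.pack(">f", x))
    return stored

print(stored_mantissa(7 / 5))    # 01100110011001100110011
print(round_trip(7 / 5) == 7 / 5)    # False: bits beyond the 23rd were lost
print(round_trip(5.75) == 5.75)      # True: 5.75 fits exactly
```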
2.    Extracting the real number from an SP floating point representation
         11000100001001100000000000000000
        1|10001000|01001100000000000000000
        S|----E---|-----------M-----------|
        Sign = 1, i.e. (-1)^1 = -1, a negative number
        Exponent field (10001000) = 136 = 127 + e, i.e. exponent e = 9;
        Mantissa = 1.01001100000000000000000
    i.e. Mantissa  = (one, plus no halves, plus one quarter (1/4), plus no eighths, plus no sixteenths, plus one thirty-second, plus one sixty-fourth, ... all remaining bits zero)
                       =>    (1 + 1/4 + 1/32 + 1/64)  = X
                       =>    (64 + 16 + 2 + 1) = X x 64;
                       =>    X = 83/64;
       So the complete number = -(83/64) x 2^9  = -664.00;
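The decoding steps above can be checked with exact fractions. This is a minimal sketch, assuming a normal number (exponent field neither all zeros nor all ones); `decode_single` is a hypothetical helper written for this example.

```python
from fractions import Fraction

def decode_single(bit_string):
    """Decode an 'S|E|M' single-precision bit string into exact fractions."""
    bits = bit_string.replace("|", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    sign = int(bits[0], 2)
    E = int(bits[1:9], 2)       # stored exponent field
    M = int(bits[9:], 2)        # 23 stored mantissa bits
    # Put the hidden bit back: significand = 1.M
    significand = 1 + Fraction(M, 2**23)
    value = (-1) ** sign * significand * Fraction(2) ** (E - 127)
    return sign, E - 127, significand, value

sign, e, significand, value = decode_single("1|10001000|01001100000000000000000")
print(sign, e, significand, value)
# 1 9 83/64 -664
```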