Innotor July 10, 2015 at 10:13

An outside view: the IEEE754 standard

A great deal has been written about representing real numbers in the floating-point format enshrined in the IEEE754 standard, including on Habrahabr. Not being a programmer, the author tried to approach this beast from the point of view of simple school mathematics, starting not from the formats approved in the standard but from the natural notions of numbers. Such an outside view may be of interest to professional programmers as well, especially where denormalized numbers are concerned.

1. NATURAL AND EXPONENTIAL FORMS OF WRITING NUMBERS

It is known from mathematics that any real number F in a positional number system with base q is written on paper as a sequence of digits. The weight of a digit depends on its position in the number. The base q of the system equals the number of digits (characters of its alphabet) and determines by what factor the weights of adjacent digit positions differ. This notation is called natural and looks as follows:

F = c_(L-1) c_(L-2) ... c_0 . d_0 ... d_(N-2) d_(N-1) (1)

where c_(L-1) c_(L-2) ... c_0 are the digits of the integer part and d_0 ... d_(N-2) d_(N-1) are the digits of the fractional part of the number. The number can have arbitrarily many significant digits: L in the integer part and N in the fractional part.
If the point in the number F of expression (1) is shifted to the left by h digits, we get a new number M, related to the original number by an exponential dependence:

M = F • q ^ (- h)

In this case the value of the number F decreases by a factor of q ^ h. So that the number does not change, M is multiplied by q ^ h. Thus, the number written in the natural form (1) can be represented in an equivalent exponential form:

F = M • q ^ h

If the point in the number F of expression (1) is shifted to the right by h digits, we get a new number M, related to the original number by the formula:

M = F • q ^ h

The value of the number F increases by a factor of q ^ h. So that the number does not change, M is multiplied by q ^ (- h). Thus, in this case, the number written in the natural form (1) can be represented in the following equivalent exponential form:

F = M • q ^ (- h)

In the general case, any real number written in the natural form (1) can be written in the equivalent exponential form as follows:

F = M ∙ q ^ (± h) (2)

where M is the number (1) with the point shifted by h positions in one direction or the other. In this notation, M is called the mantissa of the number, and q ^ (± h) is the characteristic of the number with order ± h, which the literature also calls the exponent. The sign and magnitude of the order h compensate for the displacement of the point relative to its initial position in number (1). Expressions (1) and (2) are two different ways of writing the same number.
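As a small illustration of form (2), the following Python sketch shifts the point of a decimal number and then compensates with the characteristic. The helper name `shift_point` and the sample value 123.45 are hypothetical and exist only for this example; exact rational arithmetic from the standard `fractions` module is assumed so that no rounding interferes.

```python
from fractions import Fraction

def shift_point(F, q, h):
    """Return the mantissa M obtained by moving the point of F by h digits.

    Positive h moves the point to the left (M = F * q**-h),
    negative h moves it to the right (M = F * q**|h|).
    """
    return F * Fraction(q) ** (-h)

F = Fraction("123.45")
q = 10

M_left = shift_point(F, q, 2)    # point moved 2 digits left: value 1.2345
M_right = shift_point(F, q, -2)  # point moved 2 digits right: value 12345

# Multiplying by the characteristic q**h restores the original number.
restored_left = M_left * Fraction(q) ** 2
restored_right = M_right * Fraction(q) ** (-2)
print(restored_left == F, restored_right == F)
```

Both mantissas describe the same number F; only the order h differs.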

The number (1) has L + N digits. Since the number of digits L of the integer part and N of the fractional part in the natural representation (1) can be arbitrarily large, the number M in (2) can also have arbitrarily many digits. In the general case, the number of digits of M in (2) can be infinite, for example when the number is a periodic fraction or is irrational. In practice, we deal with a limited number of digits when writing a real number in natural form. No matter how many digits we write on the right, sooner or later we must stop, if only because there will be nowhere left to write. As a result, the number is first truncated and then rounded to the rightmost digit acceptable for the given task. In doing so, of course, some accuracy of representation is lost. We do not touch upon questions of the accuracy of number representation here; a huge number of works are devoted to that issue. We only note that the accuracy is chosen within reasonable limits, and therefore a real number is always written with a limited number of digits. Thus, strictly speaking, it becomes a rational number. In computer literature, numbers that have a fractional part are called real numbers, and we will adhere to this terminology.
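The remark that a number written with a limited number of digits becomes rational can be checked directly. The sketch below assumes Python's built-in `float` (a binary format with a limited mantissa) and the standard `fractions` module; the value 0.1 is just a convenient example of a fraction that is periodic in binary.

```python
from fractions import Fraction

# The decimal fraction 1/10 is periodic in binary, so it cannot be stored exactly.
stored = Fraction(0.1)        # the exact rational value actually held by the float
exact = Fraction(1, 10)

print(stored == exact)        # the stored value is not exactly 1/10
print(stored.denominator)     # a power of two: the stored number is a binary rational
```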

In mathematics, as a rule, the exponential form is used when the number written in its natural form (1) contains insignificant zeros. To shorten the notation and avoid writing repeated insignificant digits, the number is written in exponential form (2). The order h of the characteristic then indicates the number of insignificant zeros before or after the point. More generally, as we saw above, the signed number h indicates by how many positions the point has been displaced relative to its initial position in the number. In any case, when the number of shifts h is given, the reference position relative to which the point was shifted is always known.

2. REPRESENTATION OF NUMBERS IN THE MACHINE WORD

In the computing device, a limited bit space is allocated for recording the number. Therefore, certain restrictions are imposed on the numbers written in the machine word, which determine the accuracy of the representation of numbers and the range of values they accept.

The binary number, represented in exponential form, is written in the computer as a machine word, divided into special areas. The structure of a machine word can be schematically represented as follows:

In this word, K digits are allocated for the mantissa M, R digits for the order h of the characteristic, and one digit each for the sign S of the number and the sign z of the order. The space allocated in the machine word for the mantissa will be called the machine mantissa area (OMM), and the number recorded in this area the machine mantissa. Similarly, the space allocated for the order of the characteristic will be called the machine order area (OMP), and the number recorded in this area the machine order. If the point is explicitly fixed in the OMM, the numbers represented in this format are called fixed-point numbers. Next, we will consider numbers written in exponential form (2). Numbers presented in this format are also called floating-point numbers.

3. NORMALIZATION OF NUMBERS

As noted above, when converting a number from its natural form to an exponential form, the point in a number of the form (1) can be shifted by an arbitrary number of digits to the right or left. So that the value of the number does not change, the order of the characteristic must be adjusted by the number of positions shifted. Obviously, this gives rise to multiple representations of the same number in exponential form.

Take the binary number 0.001001 and write it exponentially in a machine word whose OMM has 3 digits. Assuming that the machine mantissa is represented as a proper fraction, we have the following possible ways of writing this number: 0.1001 • 2 ^ (- 2) ≈ 0.100 • 2 ^ (- 2) = 0.010 • 2 ^ (- 1) = 0.001 • 2 ^ 0. In all these cases, the least significant digit of the number was lost because it fell outside the OMM digit grid. So we obtained the same number written in different ways.
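The alternative notations in this example can be reproduced with a short Python sketch. The function name `truncate_mantissa` is hypothetical, and plain truncation (dropping the digits that do not fit) is assumed, as in the example itself.

```python
from fractions import Fraction

K = 3                                   # number of digits in the OMM, as in the text
F = Fraction(int("1001", 2), 2 ** 6)    # binary 0.001001

def truncate_mantissa(F, h, K):
    """Write F as M * 2**h with M a proper fraction truncated to K binary digits."""
    M = F / Fraction(2) ** h            # mantissa before truncation
    M_digits = int(M * 2 ** K)          # keep K digits after the point, drop the rest
    return Fraction(M_digits, 2 ** K)

for h in (-2, -1, 0):
    M = truncate_mantissa(F, h, K)
    digits = format(int(M * 2 ** K), "03b")
    print(f"0.{digits} * 2^{h} = {M * Fraction(2) ** h}")
```

All three lines give the same value, binary 0.001, even though the mantissa and the order differ in each.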

There is ambiguity in the representation of numbers in a machine word. We must indicate to the machine a selection criterion by which one or another form of notation of a number in a machine word is preferred. Writing an exponential number in a format in which the mantissa of a number is represented uniquely is called normalization.

Currently, two options for normalizing numbers are most often considered. In the first [1], before being written to the machine word, the number is represented as a binary fraction in which a one immediately follows the point. With this normalization, the binary mantissa is a proper fraction in the range 0.1 ≤ M < 1 (in binary notation). In the second option, the real number is reduced, before being written to the machine word, to a form in which the mantissa is a mixed fraction whose integer part necessarily holds a significant digit in its lowest position. For a binary number, this digit is one. This normalization option is enshrined in the IEEE754 standard.
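The two normalization options can be compared on a concrete number. The sketch below relies on the standard function `math.frexp`, which returns a mantissa in the range 0.5 ≤ m < 1 (binary 0.1 ≤ m < 1); the sample value 6.5 is an arbitrary choice for illustration.

```python
import math

x = 6.5                      # binary 110.1

# First variant: proper-fraction mantissa, 0.5 <= m < 1 (binary 0.1 <= m < 1).
m_frac, e_frac = math.frexp(x)

# Second variant (IEEE754 style): mantissa with a one in the integer part, 1 <= m < 2.
m_ieee, e_ieee = m_frac * 2, e_frac - 1

print(m_frac, e_frac)        # 0.8125 3
print(m_ieee, e_ieee)        # 1.625 2
```

Moving from one variant to the other only shifts the point by one digit and changes the order by one.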

4. REAL NUMBERS AS REPRESENTED IN THE IEEE754 STANDARD

Currently, the IEEE754 standard [2] is widely used in computer arithmetic for operating on real numbers. This standard introduces a class of normalized numbers, which solves two problems. The first is the ambiguous representation of floating-point numbers. The second is the ability to represent numbers in a wide range of values. To solve these problems, it was proposed to bring a real number to normalized form before writing it into the machine word.

Consider the features of the representation of real floating point numbers in the IEEE754 standard.

In accordance with the standard, all numbers in a machine word are presented in normalized form. To do this, they are transformed to the form:

F = s ∙ 2 ^ (± h) ∙ 1.M

Here 1.M is the mantissa, consisting of a one in the integer part and a number M, which is written in the OMM immediately after the one. The mantissa values of normalized numbers lie in the range

1 ≤ 1.M < 2,

and the orders of the characteristics lie in the range -(B-1) ≤ h ≤ B, where B is the maximum number that can be written in the OMP of a machine word.

The binary normalized number in a machine word with a K-bit OMM is schematically as follows:

The digits for the sign of the order z and the sign of the number S can be placed anywhere in the machine word; this does not affect the nature of the representable numbers. As can be seen from the figure, the K-th digit of the mantissa of a normalized number is always equal to one. This digit is present in the machine word only virtually; it is not actually in the digit grid. However, the virtual one is always taken into account when representing numbers and when performing mathematical operations on them. It is assumed that in a normalized number the point in the mantissa is located immediately to the right of this one.

In the standard, in order to save on the sign of the order, a biased (offset) representation of the orders is used for numbers written in the machine word. For simplicity of presentation, the orders of the characteristics are given hereinafter without the bias.
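The implicit one and the offset order can be seen in a real machine word. The sketch below is an illustration under stated assumptions: it uses the IEEE754 single-precision layout (1 sign bit, 8 order bits stored with offset 127, 23 mantissa bits), handles only normalized numbers, and the function name `decode_float32` is hypothetical.

```python
import struct

def decode_float32(x):
    """Split a number stored as IEEE754 single precision into its three fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    biased = (bits >> 23) & 0xFF      # 8-bit order field, stored with offset 127
    fraction = bits & 0x7FFFFF        # 23 stored mantissa bits (the leading one is implicit)
    return sign, biased, fraction

sign, biased, fraction = decode_float32(6.5)
h = biased - 127                      # remove the offset to get the true order
mantissa = 1 + fraction / 2 ** 23     # restore the implicit one: 1.M
print(sign, h, mantissa)              # 0 2 1.625
```

The leading one of 1.625 is never stored; it is restored when the word is decoded.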

With the normalized representation of numbers, a problem immediately arises: the absence of zero. In normalized numbers, zero does not fall within the mantissa range, so a zero result of a calculation cannot be obtained directly. In mathematical operations on normalized numbers in a computer, zero cannot be obtained explicitly; only an indication of it is obtained.

The second problem that arises when numbers are normalized is the limited range of representable small numbers, due to the fact that the mantissas of normalized numbers cannot be less than one. This forced the authors of the standard to introduce a class of denormalized numbers, which significantly complicated the algorithms for working with such numbers.

The minimum normalized number that can be written in a machine word looks schematically as follows:

This number, taking into account the position of the implicit one, can be represented by the following expression:

| F_min | = S • 2^(-B) • (1 + 0) = S • 2^(-B)

where B is the maximum number that can be written in the OMP. The standard treats the number 2^(-B) as a special number, which is taken as ± 0. The sign of zero follows the sign S of the number. Further, for simplicity, we will consider only positive numbers.

The minimum positive normalized number in the standard is considered to be a number of the form:

Thus, the minimum normalized number in the standard is represented by the formula:

| F_min | = 2 ^ (- (B-1))

The order of the characteristic equal to -B is reserved in the standard for a special case. The step with which the values of normalized numbers change is ξ = 2^(-B).

The maximum positive normalized number in the standard is presented in the form:

This number can be written in the form of the expression:

| F_max | = 2^B • (2 - 2^(-K))

We calculate the range of representable numbers normalized by the standard:

| F_max | / | F_min | = 2^B • (2 - 2^(-K)) / 2^(-B+1) = 2^(2B) • (1 - 2^(-(K+1))) ≈ 2^(2B)

As we can see, the range of representable normalized numbers is determined mainly by the range of numbers that can be written in the OMP.
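The range estimate can be checked numerically. The following sketch uses the model of this section (orders from -(B-1) to B, K mantissa digits plus the implicit one); the function name `normalized_range` is hypothetical. The pairs (23, 127) and (52, 1023) are an added assumption: they correspond to the mantissa width and order range of IEEE754 single and double precision.

```python
from fractions import Fraction

def normalized_range(K, B):
    """Return (F_min, F_max, F_max / F_min) for the model used in the text."""
    F_min = Fraction(2) ** (-(B - 1))                       # 1.00...0 * 2^-(B-1)
    F_max = Fraction(2) ** B * (2 - Fraction(2) ** (-K))    # 1.11...1 * 2^B
    return F_min, F_max, F_max / F_min

for K, B in [(2, 2), (23, 127), (52, 1023)]:
    F_min, F_max, ratio = normalized_range(K, B)
    closed_form = Fraction(2) ** (2 * B) * (1 - Fraction(2) ** (-(K + 1)))
    # The last column shows how close the ratio is to 2^(2B).
    print(K, B, ratio == closed_form, float(ratio / Fraction(2) ** (2 * B)))
```

Even for K = 2 the ratio is within about 12% of 2^(2B); for wider mantissas the difference is negligible, so the range is governed almost entirely by B.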

We also note that even at K = 1, when the mantissa contains only one digit, choosing an appropriate value of B makes it possible to write either a very large or a very small number. Figuratively speaking, by analogy with a geographical map, the mantissa of a number in exponential form determines the number of objects under consideration, and the coefficient q ^ (± h) is a scale factor that determines the distance between these objects. As the scale factor q ^ (± h) increases, with a limited map area (the bit depth of the OMM), smaller objects (numbers) become distinguishable, and the number of distinguishable objects (numbers) decreases. In the decimal system, the prefixes nano, micro, milli, kilo, mega, etc. are often used instead of the coefficient q ^ (± h) to describe quantitative characteristics of physical objects.

In order to expand the range of representable numbers in the direction of values close to zero, the standard introduced a class of denormalized numbers. Denormalized numbers are numbers that are determined by the formula:

F_den = 2 ^ (- (B-1)) • (M • 2 ^ (- K))

In this expression, M is an integer in the range 1 ≤ M ≤ 2^K - 1, and the mantissa (M • 2^(-K)) is a fractional number that lies in the range

2^(-K) ≤ (M • 2^(-K)) ≤ 1 - 2^(-K) (3)

As can be seen, there is no implicit one in the expression for denormalized numbers. When the characteristic of a normalized number becomes equal to 2^(-B) (the special case), further transformations are carried out as if the characteristic of the number were 2^(-(B-1)), and the values of the fractional mantissa lie in the range (3). This allows a smooth transition, without a jump, from the region of normalized numbers to the region of denormalized numbers. In the region of denormalized numbers, the step between numbers becomes ξ = 2^(-(B-1+K)). When the mantissa of a denormalized number becomes equal to zero, the number is considered equal to machine zero.
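The smooth transition can be illustrated with a toy format. The sketch below is a model under stated assumptions, not the standard's algorithm: it takes K = 2 and B = 2, lets the integer M run from 1 to 2^K - 1, and uses a hypothetical helper `denormalized`.

```python
from fractions import Fraction

def denormalized(K, B):
    """All positive denormalized values 2^-(B-1) * (M * 2^-K), M = 1 .. 2^K - 1."""
    scale = Fraction(2) ** (-(B - 1))
    return [scale * Fraction(M, 2 ** K) for M in range(1, 2 ** K)]

K, B = 2, 2                                   # toy sizes, assumed for the example
den = denormalized(K, B)
step = Fraction(2) ** (-(B - 1 + K))          # the step in the denormalized region
F_min = Fraction(2) ** (-(B - 1))             # smallest normalized number

print([str(v) for v in den])                  # ['1/8', '1/4', '3/8']
print(F_min - den[-1] == step)                # no jump at the boundary
```

The distance from the largest denormalized number to the smallest normalized number equals the same step ξ, which is what "without a jump" means here.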

A denormalized positive number in a machine word looks like this:

The minimum positive denormalized number in a machine word looks like:

The formula for the minimum denormalized number will be as follows:

F_(den min) = 2^(-(B-1+K)) = 2^(-(B+K-1))

The maximum positive denormalized number looks as follows in the machine word:

The formula for calculating the maximum denormalized number is:

F_(den max) = 2^(-(B-1)) • (1 - 2^(-K))

The range of denormalized numbers represented in the standard is:

F_(den max) / F_(den min) = 2^(-(B-1)) • (1 - 2^(-K)) / 2^(-(B+K-1)) = 2^K • (1 - 2^(-K)) = 2^K - 1

Below, as an example, is a table showing the conversion of a sequence of numbers before they are written into a machine word in accordance with the IEEE754 standard. It is assumed that the OMM consists of K = 2 digits plus the implicit one, and that the maximum number that can be written in the OMP is B = 2. The minimum number from which normalized numbers begin is | F_min | = 2 ^ (- 2 + 1) = 2 ^ (- 1) = 0.1 (binary). In the table, this area is shown on a white background. Rows 1 to 3, on a gray background, show the area of denormalized numbers, for which F_(den min) = 2 ^ (- (2 + 2-1)) = 2 ^ (- 3). The step of change for these numbers is ξ = 0.001 (binary).

In the 1st column of the table, the steps for changing the number from the minimum to the maximum are numbered. Column 2 contains the numbers that are formed with each new step. The x here means an arbitrary digit, zero or one. The numbers denoted by x do not participate in the formation of the machine word mantissa. Column 3 presents the mantissa of the number, after its normalization. Column 4 presents the values of the orders of characteristics of numbers after their normalization. Column 5 presents the values of the machine orders of the characteristics of the numbers recorded in the machine word. Column 6 shows the denormalized numbers that are written to OMM. In column 7, the unique denormalized numbers that are written in the machine word are numbered. And in column 8, unique normalized numbers are numbered.

As follows from the table of the numbers written in the machine word, the range of representable binary normalized numbers is (1.11 • 2 ^ 2) / (1.0 • 2 ^ (- 1)) = 7 / 0.5 = 14. The range of normalized and denormalized numbers combined is (1.11 • 2 ^ 2) / (0.01 • 2 ^ (- 1)) = 7 / 0.125 = 56. It can be seen that the range of representable numbers, defined as the ratio of the maximum representable number to the minimum, does not coincide with the count of representable numbers.
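The numbers of this example can be enumerated with a few lines of Python. This is only a sketch of the model described above (K = 2, B = 2, orders from -(B-1) to B, denormalized numbers at order -(B-1) without the implicit one); it is not a reconstruction of the table itself, and the variable names are chosen for the example.

```python
from fractions import Fraction

K, B = 2, 2   # sizes taken from the example in the text

# Denormalized numbers: no implicit one, order fixed at -(B-1).
den = [Fraction(M, 2 ** K) * Fraction(2) ** (-(B - 1)) for M in range(1, 2 ** K)]

# Normalized numbers: implicit one, order h from -(B-1) to B.
norm = [(1 + Fraction(M, 2 ** K)) * Fraction(2) ** h
        for h in range(-(B - 1), B + 1)
        for M in range(2 ** K)]

for value in den + norm:
    print(float(value))

print("normalized range:", norm[-1] / norm[0])               # 14
print("combined range:", norm[-1] / den[0])                  # 56
print("count of distinct numbers:", len(den) + len(norm))    # 19
```

The ratios 14 and 56 match the text, while only 19 distinct positive numbers are representable in this toy format.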

Looking at the table, you can see that while the machine order equals its minimum value (h = -1), increasing the mantissa by 1 in the last digit increases the number recorded in the machine word by one step as well. After the number in column 2 becomes equal to the binary number 0.111, the next increment gives the number 1.000, which does not fit into the OMM digit grid. Therefore, we shift the point in the mantissa by one digit to the left and increase the machine order by one. From then on, the machine numbers change with a doubled step. With each increase of h by one, the distance between adjacent machine numbers doubles.
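The doubling of the step can also be observed on real double-precision numbers. The sketch below assumes the IEEE754 double layout used by Python's `float` (52 stored mantissa bits) and a hypothetical helper `next_up`, which moves to the neighbouring machine number by adding one to the bit pattern; it is valid only for positive finite values.

```python
import struct

def next_up(x):
    """Next representable double above a positive finite x (adds 1 to the bit pattern)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return struct.unpack(">d", struct.pack(">Q", bits + 1))[0]

for h in range(0, 4):
    x = 2.0 ** h
    gap = next_up(x) - x
    # The gap between neighbours at order h is 2^(h - 52): it doubles with each order.
    print(h, gap == 2.0 ** (h - 52))

# When the mantissa is all ones, the next increment overflows into the order field.
largest_below_two = 2.0 - 2.0 ** -52
print(next_up(largest_below_two) == 2.0)
```

The last line mirrors the step in the text where 0.111 is followed by 1.000: the mantissa no longer fits, so the order is increased by one.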

LITERATURE

1. wiki.mvtom.ru/index.php/ Forms_of_view_of_number_of_computers
2. IEEE Standard for Binary Floating-Point Arithmetic. Copyright 1985 by The Institute of Electrical and Electronics Engineers, Inc 345 East 47th Street, New York, NY 10017 USA.
3. www.softelectro.ru/ieee754.html
4. neerc.ifmo.ru/wiki/index.php?title= Real_Numbers_representation&printable=yes
