Arithmetic of arbitrary precision in Erlang
Even schoolchildren are aware of the existence of various number systems and the fact that not every finite decimal fraction is a finite fraction in a binary number system. Few people think that because of this fact, operations on float and double are not precise.
If we talk about Erlang, then it, like many other languages, implements the IEEE754 standard for float, while the standard type Integer in Erlang is implemented using arbitrary precision arithmetic. However, I would like to have not only bigint, but also the possibility of operating with rational, complex and floating point numbers with the necessary accuracy.
The article presents a minimal overview of the theory of coding floating-point numbers and the most vivid examples of the emerging effects. The solution that provides the necessary accuracy of operations through the transition to a fixed-point representation is designed as an EAPA (Erlang Arbitrary Precision Arithmetic) library, designed to meet the needs of financial applications developed at Erlang / Elixir.
Standards, standards, standards ...
To date, the main standard of binary floating point arithmetic is widely used in IEEE754 engineering and programming. It defines four presentation formats:
- single-precision 32 bits
- with double precision (double-precision) 64 bits
- with single extended precision (single-extended precision)> = 43 bits (rarely used)
- Double extended precision (double-precision precision)> = 79 bits (usually 80 bits are used)
and four rounding modes: - Rounding tending to the nearest whole.
- Rounding to zero.
- Rounding aiming at + ∞
- Rounding toward -∞
Most modern microprocessors are made with the hardware implementation of the representation of real variables in the IEEE754 format. Presentation formats limit the size limit of a number, and rounding modes affect accuracy. Programmers often cannot change the behavior of hardware and implementations of programming languages. For example, the official Erlang implementation stores float in 3 words on a 64-bit machine and in 4 words on a 32-bit machine.
As mentioned above, the numbers in the IEEE754 format are a finite set, on which an infinite set of real numbers is mapped, so the original number can be represented in the IEEE754 format with an error.
The main mass of numbers when mapping to a finite set has a stable and small relative error. So, for float it is 11.920928955078125e-6%, and for double - 2.2204460492503130808472633361816e-14%. In the life of programmers, most of the everyday tasks being solved allow us to neglect this error, although it should be noted that in simple tasks it is possible to step on a rake, since the magnitude of the absolute error can reach 10 31 and 10 292 for float and double, respectively, causing difficulties in calculations.
Effects illustration
From general information to the case. Let's try to reproduce the resulting effects in Erlang.
All the examples below are designed as ct-tests.
Rounding and loss of accuracy
Let's start with the classics - the addition of two numbers: 0.1 + 0.2 =?
t30000000000000004(_)->
["0.30000000000000004"] = io_lib:format("~w", [0.1 + 0.2]).
The result of the addition is slightly different from the intuitively expected, and the test passes successfully. Let's try to achieve the right result. Rewrite the test using EAPA:
t30000000000000004_eapa(_)->%% prec = 1 symbols after coma
X = eapa_int:with_val(1, <<"0.1">>),
Y = eapa_int:with_val(1, <<"0.2">>),
<<"0.3">> = eapa_int:to_float(1, eapa_int:add(X, Y)).
This test also succeeds, indicating that the problem has been fixed.
Let's continue the experiments, add a very small amount to 1.0:
tiny(_)->
X = 1.0,
Y = 0.0000000000000000000000001,
1.0 = X + Y.
As you can see, our increase went unnoticed. We are trying to correct the problem, illustrating one of the library's possibilities in the way - automatic scaling:
tiny_eapa(_)->
X1 = eapa_int:with_val(1, <<"1.0">>),
X2 = eapa_int:with_val(25, <<"0.0000000000000000000000001">>),
<<"1.0000000000000000000000001">> = eapa_int:to_float(eapa_int:add(X1, X2)).
Bit Overflow
In addition to problems associated with small numbers, overflow is an obvious and significant problem.
float_overflow(_) ->1.0 = 9007199254740991.0 - 9007199254740990.0,
1.0 = 9007199254740992.0 - 9007199254740991.0,
0.0 = 9007199254740993.0 - 9007199254740992.0,
2.0 = 9007199254740994.0 - 9007199254740993.0.
As can be seen from the test, at some point the difference ceases to be equal to 1.0, which is obviously a problem. EAPA allows you to solve this problem:
float_overflow_eapa(_)->
X11 = eapa_int:with_val(1, <<"9007199254740992.0">>),
X21 = eapa_int:with_val(1, <<"9007199254740991.0">>),
<<"1.0">> = eapa_int:to_float(1, eapa_int:sub(X11, X21)),
X12 = eapa_int:with_val(1, <<"9007199254740993.0">>),
X22 = eapa_int:with_val(1, <<"9007199254740992.0">>),
<<"1.0">> = eapa_int:to_float(1, eapa_int:sub(X12, X22)),
X13 = eapa_int:with_val(1, <<"9007199254740994.0">>),
X23 = eapa_int:with_val(1, <<"9007199254740993.0">>),
<<"1.0">> = eapa_int:to_float(1, eapa_int:sub(X13, X23)).
Dangerous reduction
The following test demonstrates the occurrence of a dangerous reduction. This process is accompanied by a catastrophic decrease in the accuracy of calculations in operations where the resulting value is much less than the input. In our case, the result of subtraction is 1.
Let us show that this problem is present in Erlang:
reduction(_)->
X = float(87654321098765432),
Y = float(87654321098765431),
16.0 = X-Y. %% has to be 1.0
It turned out 16.0 instead of the expected 1.0. Let's try to correct this situation:
reduction_eapa(_)->
X = eapa_int:with_val(1, <<"87654321098765432">>),
Y = eapa_int:with_val(1, <<"87654321098765431">>),
<<"1.0">> = eapa_int:to_float(eapa_int:sub(X, Y)).
Other features of floating point arithmetic in Erlang
Let's start with ignoring negative zero.
eq(_)->true = list_to_float("0.0") =:= list_to_float("-0.0").
Just want to say that EAPA retains this behavior:
eq_eapa(_)->
X = eapa_int:with_val(1, <<"0.0">>),
Y = eapa_int:with_val(1, <<"-0.0">>),
true = eapa_int:eq(X, Y).
as it is permissible. In Erlang, there is no intelligible syntax and processing of NaN and infinities, which gives rise to a number of features, for example, like this:
1> math:sqrt(list_to_float("-0.0")).
0.0
The next point is the peculiarity of handling large and small numbers. Let's try to reproduce for small ones:
2> list_to_float("0."++lists:duplicate(322, $0)++"1").
1.0e-3233> list_to_float("0."++lists:duplicate(323, $0)++"1").
0.0
and for large numbers:
4> list_to_float("1"++lists:duplicate(308, $0)++".0").
1.0e3085> list_to_float("1"++lists:duplicate(309, $0)++".0").
** exception error: bad argument
Let's give a couple more examples for small numbers:
6> list_to_float("0."++lists:duplicate(322, $0)++"123456789").
1.0e-3237> list_to_float("0."++lists:duplicate(300, $0)++"123456789").
1.23456789e-301
8> 0.123456789e-100 * 0.123456789e-100.
1.524157875019052e-2029> 0.123456789e-200 * 0.123456789e-200.
0.0
The examples above confirm the truth for Erlang projects: money cannot be considered in IEEE754.
EAPA (Erlang Arbitrary-Precision Arithmetic)
EAPA - NIF extension written in Rust. At the moment, the EAPA repository provides the most simple and convenient interface for working with fixed-point numbers. Among the features of eapa_int are the following:
- No effects IEEE754 encoding
- Support large numbers
- Adjustable accuracy up to 126 decimal places. (in current implementation)
- Autospeiling
- Support for all basic operations on numbers
- More or less complete testing, including property based.
Interface eapa_int
:
with_val/2
- translation of a floating-point number into a fixed representation, which can also be safely used in json, xml.to_float/2
- conversion of a number with a fixed point to a floating-point number with a given accuracy.to_float/1
- conversion of a number with a fixed point to a floating point number.add/2
- the sum of two numberssub/2
- differencemul/2
- multiplicationdivp/2
- divisionmin/2
- minimum of numbersmax/2
- maximum of numberseq/2
- check of equality of numberslt/2
- check that the number is lesslte/2
- verification is less thangt/2
- check that the number is greatergte/2
- check is more than equal
EAPA code can be found in the repository https://github.com/Vonmo/eapa
When should i use eapa_int? For example, if your application works with money or you need to conveniently and accurately perform computational operations on numbers of the form 92233720368547758079223372036854775807.92233720368547758079223372036854775807, you can safely use EAPA.
Like any solution, EAPA is a compromise. We obtain the required accuracy by sacrificing memory and computation speed. Performance tests and statistics collected on real systems show that most of the operations are performed in the range of 3-30 μs. This point also needs to be considered when choosing an interface with a fixed point EAPA.
Conclusion
Of course, it’s not always necessary to solve similar problems on Erlang or Elixir, but when a problem arises and a suitable tool is not found, you have to invent a solution.
This article is an attempt to share with the community a tool and experience, in the hope that for someone this library will be useful and help save time.
How do you count money in Erlang?
PS Working with rational and complex numbers, as well as native access to Integer, Float, Complex, Rational arbitrary precision types will be covered in the following publications. Do not switch!
Materials on the topic: