
Loss of precision when converting Double to Float, or “Where did the penny go?”
Numeric conversions usually go from a narrower type to a wider one, so that no digits are lost. But what if a previous developer converted from Double to Float, and pennies began to disappear from reports?
This article explores what happens to floating-point numbers in Java when they are converted from double to float:
99999999.33333333 -> 100000000.0000000
98888888.33333333 -> 98888888.0000000
2974815.78000000 -> 2974815.7500000
Let's see what this conversion leads to and why it behaves this way. After all, since the numbers used in the project are far from the maximum values of the float and double types, it would seem that converting from double to float should have no negative consequences in most cases.
Ideas are best backed by concrete examples, so here is the code right away. It started out with real numbers from the project, but under the influence of a Stack Overflow discussion about this conversion it turned into something more intriguing.
public class Main {
    static void testDoubleToFloat(double d) {
        float f = (float) d;
        System.out.println();
        System.out.println(String.format("double %.10f\t%s", d, Long.toBinaryString(Double.doubleToRawLongBits(d))));
        System.out.println(String.format("float %.10f\t %s", f, Integer.toBinaryString(Float.floatToRawIntBits(f))));
    }

    public static void main(String[] args) {
        System.out.println(String.format("double: %.10f / %.10f", Double.MIN_VALUE, Double.MAX_VALUE));
        System.out.println(String.format("float: %.10f / %.10f", Float.MIN_VALUE, Float.MAX_VALUE));
        /*
         * By default, floating-point calculations are performed in double.
         */
        testDoubleToFloat(99999999.0 + 1.0 / 3.0); // add a repeating fraction
        testDoubleToFloat(98888888.0 + 1.0 / 3.0); // variant where the nines are not rounded up
        testDoubleToFloat(2974815.78);
        testDoubleToFloat(-2974815.78);
    }
}
Execution result
double: 0.0000000000 / 179769313486231570000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0000000000
float: 0.0000000000 / 340282346638528860000000000000000000000.0000000000
double 99999999.3333333300 100000110010111110101111000001111111101010101010101010101010101
float 100000000.0000000000 1001100101111101011110000100000
double 98888888.3333333300 100000110010111100100111011001011100001010101010101010101010101
float 98888888.0000000000 1001100101111001001110110010111
double 2974815.7800000000 100000101000110101100100010111111100011110101110000101000111101
float 2974815.7500000000 1001010001101011001000101111111
double -2974815.7800000000 1100000101000110101100100010111111100011110101110000101000111101
float -2974815.7500000000 11001010001101011001000101111111
The result is the same on the Oracle JDK:
/opt/jdk1.7/bin/java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
and on OpenJDK:
java -version
java version "1.7.0_25"
OpenJDK Runtime Environment (IcedTea 2.3.12) (7u25-2.3.12-4ubuntu3)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
Conversion method
To show that different constructs convert double to float in the same way, we add several more conversions and compare the results:
public class Main {
    static void testDoubleToFloat(double d) {
        float f = (float) d;
        Float f2 = new Float(d);
        float f3 = Float.parseFloat(new Double(d).toString());
        float f4 = Float.parseFloat(String.format("%.10f", d));
        System.out.println();
        System.out.println(String.format("double %.10f\t%s", d, Long.toBinaryString(Double.doubleToRawLongBits(d))));
        System.out.println(String.format("float %.10f\t %s", f, Integer.toBinaryString(Float.floatToRawIntBits(f))));
        System.out.println(String.format("Float %.10f\t %s", f2, Integer.toBinaryString(Float.floatToRawIntBits(f2))));
        System.out.println(String.format("float %.10f\t %s", f3, Integer.toBinaryString(Float.floatToRawIntBits(f3))));
        System.out.println(String.format("float %.10f\t %s", f4, Integer.toBinaryString(Float.floatToRawIntBits(f4))));
    }

    public static void main(String[] args) {
        System.out.println(String.format("double: %.10f / %.10f", Double.MIN_VALUE, Double.MAX_VALUE));
        System.out.println(String.format("float: %.10f / %.10f", Float.MIN_VALUE, Float.MAX_VALUE));
        testDoubleToFloat(99999999.0 + 1.0 / 3.0);
        testDoubleToFloat(98888888.0 + 1.0 / 3.0);
        testDoubleToFloat(2974815.78);
        testDoubleToFloat(-2974815.78);
    }
}
Execution result
double: 0.0000000000 / 179769313486231570000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0000000000
float: 0.0000000000 / 340282346638528860000000000000000000000.0000000000
double 99999999.3333333300 100000110010111110101111000001111111101010101010101010101010101
float 100000000.0000000000 1001100101111101011110000100000
Float 100000000.0000000000 1001100101111101011110000100000
float 100000000.0000000000 1001100101111101011110000100000
float 100000000.0000000000 1001100101111101011110000100000
double 98888888.3333333300 100000110010111100100111011001011100001010101010101010101010101
float 98888888.0000000000 1001100101111001001110110010111
Float 98888888.0000000000 1001100101111001001110110010111
float 98888888.0000000000 1001100101111001001110110010111
float 98888888.0000000000 1001100101111001001110110010111
double 2974815.7800000000 100000101000110101100100010111111100011110101110000101000111101
float 2974815.7500000000 1001010001101011001000101111111
Float 2974815.7500000000 1001010001101011001000101111111
float 2974815.7500000000 1001010001101011001000101111111
float 2974815.7500000000 1001010001101011001000101111111
double -2974815.7800000000 1100000101000110101100100010111111100011110101110000101000111101
float -2974815.7500000000 11001010001101011001000101111111
Float -2974815.7500000000 11001010001101011001000101111111
float -2974815.7500000000 11001010001101011001000101111111
float -2974815.7500000000 11001010001101011001000101111111
As you can see, the expressions “new Float(d)” and “(float) d” give the same result, because the first is implemented with the second:
/*
 * Copyright (c) 1994, 2010, Oracle and/or its affiliates. All rights reserved.
 * ORACLE PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
 */
...
public final class Float extends Number implements Comparable<Float> {
    ...
    public Float(double value) {
        this.value = (float)value;
    }
    ...
}
If you dig into the Float.parseFloat function, it leads through several other functions to the following line:
return (float)Double.longBitsToDouble( lbits );
This converts a double value to a float in exactly the same way.
Thus, we have seen that, at least in OpenJDK, the most obvious ways to convert double to float all come down to one construct:
float f = (float) d;
Float and double storage formats
In each example, we called Long.toBinaryString for the double and Integer.toBinaryString for the float to show the low-level storage format of each variable. The format itself is already well covered elsewhere (“What you need to know about floating point arithmetic”, as well as a good translation of the English wiki article, which explains double precision well), so here we will only consider rounding.
The programs above returned the following results:
double 2974815.7800000000 100000101000110101100100010111111100011110101110000101000111101
float 2974815.7500000000 1001010001101011001000101111111
The double type occupies 64 bits and float 32, but we see 63 and 31 characters. This is because toBinaryString omits leading zeros, and the sign bit of a positive number is 0 (the negative values above print all 64 and 32 characters). Written out in full and split into fields, these numbers look like this:
double 2974815.7800000000 0 10000010100 0110101100100010111111100011110101110000101000111101
float 2974815.7500000000 0 10010100 01101011001000101111111
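The padded, field-separated form above can be reproduced with a few lines of code. The following is only a minimal sketch: the class name BitLayout and the helpers pad and layout are hypothetical names chosen for illustration, and the field widths (1 + 11 + 52 for double, 1 + 8 + 23 for float) are the ones described in this section.

```java
class BitLayout {
    // Left-pads a binary string with zeros up to the given width.
    static String pad(String bits, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = bits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(bits).toString();
    }

    // Full 64-bit pattern of a double, split as sign | exponent (11 bits) | mantissa (52 bits).
    static String layout(double d) {
        String bits = pad(Long.toBinaryString(Double.doubleToRawLongBits(d)), 64);
        return bits.substring(0, 1) + " " + bits.substring(1, 12) + " " + bits.substring(12);
    }

    // Full 32-bit pattern of a float, split as sign | exponent (8 bits) | mantissa (23 bits).
    static String layout(float f) {
        String bits = pad(Integer.toBinaryString(Float.floatToRawIntBits(f)), 32);
        return bits.substring(0, 1) + " " + bits.substring(1, 9) + " " + bits.substring(9);
    }

    public static void main(String[] args) {
        double d = 2974815.78;
        System.out.println("double " + layout(d));
        System.out.println("float  " + layout((float) d));
    }
}
```

Padding on the left restores exactly the zeros that toBinaryString drops, so the sign bit always ends up in the first position.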
The first bit is the sign of the number. The next 11 bits (for double) or 8 bits (for float) are the exponent. After that comes the mantissa, which plays the most interesting role. First impression: when converting from double to float, all mantissa bits after the 23rd are simply lost. But let's first reconstruct the numbers from their bits, to work through everything in order:
- First bit 0 => the sign is positive in both cases
- Exponents: 10000010100₂ − 1023 = 21 (double) and 10010100₂ − 127 = 21 (float)
- Mantissas: 1.0110101100100010111111100011110101110000101000111101₂ × 2^21 ≈ 2974815.7799999998 (double) and 1.01101011001000101111111₂ × 2^21 = 2974815.75 (float)
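The same decoding can be done in code by masking the fields out of the raw bits. This is a sketch under stated assumptions: DecodeFields, exponent and rebuild are hypothetical names, and the code handles only normal numbers (no zeros, subnormals, infinities or NaN), which is enough for the values discussed here.

```java
class DecodeFields {
    // Unbiased exponent of a double: the 11 bits after the sign, minus the bias 1023.
    static int exponent(double d) {
        long bits = Double.doubleToRawLongBits(d);
        return (int) ((bits >>> 52) & 0x7FF) - 1023;
    }

    // Unbiased exponent of a float: the 8 bits after the sign, minus the bias 127.
    static int exponent(float f) {
        int bits = Float.floatToRawIntBits(f);
        return ((bits >>> 23) & 0xFF) - 127;
    }

    // Rebuilds a normal double from its fields: (2^52 + mantissa) * 2^(exponent - 52).
    static double rebuild(double d) {
        long bits = Double.doubleToRawLongBits(d);
        long significand = (1L << 52) | (bits & 0xFFFFFFFFFFFFFL); // implicit leading 1 + 52 stored bits
        double magnitude = Math.scalb((double) significand, exponent(d) - 52);
        return bits < 0 ? -magnitude : magnitude; // sign bit set means negative
    }

    // Rebuilds a normal float from its fields: (2^23 + mantissa) * 2^(exponent - 23).
    static float rebuild(float f) {
        int bits = Float.floatToRawIntBits(f);
        int significand = (1 << 23) | (bits & 0x7FFFFF); // implicit leading 1 + 23 stored bits
        float magnitude = Math.scalb((float) significand, exponent(f) - 23);
        return bits < 0 ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        double d = 2974815.78;
        float f = (float) d;
        System.out.println("double exponent: " + exponent(d) + ", rebuilt: " + rebuild(d));
        System.out.println("float  exponent: " + exponent(f) + ", rebuilt: " + rebuild(f));
    }
}
```

Treating the significand as an integer and scaling it by 2^(exponent − 52) or 2^(exponent − 23) is equivalent to writing it as 1.mantissa × 2^exponent.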
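Thus, when the 52-bit mantissa 0110 1011 0010 0010 1111 1110 0011 1101 0111 0000 1010 0011 1101 of the double is narrowed to 23 bits, its last 29 bits are dropped and 011 0101 1001 0001 0111 1111 remains in the float, which gives us a slightly different number. Note that the conversion rounds to the nearest float rather than simply truncating: here the dropped bits begin with 0, so the value is rounded down, whereas 99999999.33333333 was rounded up to 100000000.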
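One way to see how coarse float becomes at these magnitudes is to ask for the distance between neighbouring float values. The sketch below uses the standard Math.ulp and Math.nextAfter methods; the class FloatSpacing and its helpers spacing and below are hypothetical names introduced only for this illustration.

```java
class FloatSpacing {
    // Distance between adjacent float values around the float nearest to d.
    static float spacing(double d) {
        return Math.ulp((float) d);
    }

    // The largest float strictly below f.
    static float below(float f) {
        return Math.nextAfter(f, Double.NEGATIVE_INFINITY);
    }

    public static void main(String[] args) {
        double[] samples = {99999999.0 + 1.0 / 3.0, 98888888.0 + 1.0 / 3.0, 2974815.78};
        for (double d : samples) {
            float f = (float) d;
            System.out.println(String.format(
                    "%.8f -> %.8f, neighbour below: %.8f, spacing: %.2f",
                    d, f, below(f), spacing(d)));
        }
    }
}
```

Near 2974815 adjacent floats are 0.25 apart, so .78 cannot survive; near 100000000 they are 8 apart, and 99999999.33 is closer to 100000000 than to 99999992.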
Conclusion
Thus, converting from a more precise format can silently lose significant digits. As for Java itself, it is better to use double throughout, because working with float invites lossy conversions from double. And to store money, use BigDecimal, so as not to lose a penny.
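To put a number on the lost pennies, the exact decimal amount can be compared with the value that survives a cast to float. This is a minimal sketch: PennyCheck and lostByFloat are hypothetical names, and the amount is passed as a string so that BigDecimal sees the decimal digits exactly.

```java
import java.math.BigDecimal;

class PennyCheck {
    // Exact difference between a decimal amount and the float it becomes after a (float) cast.
    static BigDecimal lostByFloat(String amount) {
        BigDecimal exact = new BigDecimal(amount);
        float narrowed = (float) Double.parseDouble(amount);
        // new BigDecimal(double) captures the stored binary value exactly, with no extra rounding.
        return exact.subtract(new BigDecimal((double) narrowed));
    }

    public static void main(String[] args) {
        System.out.println(lostByFloat("2974815.78")); // prints 0.03
    }
}
```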
PS
The OpenJDK bug tracker contains issues closely related to this topic:
- Fix Float.parseFloat to round correctly and preserve monotonicity: we have already figured out why this happens
- Direct String-to-float conversion does not preserve monotonicity: the same thing
The workaround listed for these issues says:
CUSTOMER SUBMITTED WORKAROUND:
Avoiding direct String-to-float conversion by using intermediate doubles.
Proposed solutions
- Store money in BigDecimal:
    BigDecimal bg = new BigDecimal("2974815.78");
    System.out.println(bg);
- A Currency class along these lines:
    class Currency {
        long value;
        ...
        public double asDouble() {
            return value / 100.0;
        }
        ...
        <various conveniences, including custom string parsing, e.g. based on StringTokenizer>
        ...
    }
I suspect the second solution will bring its own errors and quirks, although it is a legitimate option.
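As a rough illustration of the second option, here is one possible shape for a class that keeps the amount as a whole number of cents. Everything in it is an assumption made for the example: the name Money (chosen to avoid clashing with java.util.Currency), the two-digit minor unit, and parsing through BigDecimal instead of a hand-written parser. Math.addExact requires Java 8 or later.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class Money {
    private final long cents; // amount in minor units, e.g. 297481578 for 2974815.78

    private Money(long cents) {
        this.cents = cents;
    }

    // Parses a decimal string exactly; throws ArithmeticException on more than two fraction digits.
    static Money parse(String text) {
        BigDecimal scaled = new BigDecimal(text).setScale(2, RoundingMode.UNNECESSARY);
        return new Money(scaled.movePointRight(2).longValueExact());
    }

    // Integer addition; throws ArithmeticException on overflow instead of wrapping around.
    Money plus(Money other) {
        return new Money(Math.addExact(cents, other.cents));
    }

    long cents() {
        return cents;
    }

    @Override
    public String toString() {
        return BigDecimal.valueOf(cents, 2).toPlainString();
    }

    public static void main(String[] args) {
        Money total = Money.parse("2974815.78").plus(Money.parse("0.01"));
        System.out.println(total); // prints 2974815.79
    }
}
```

Because the value never passes through float or double, addition stays exact; the remaining pitfalls are in operations this sketch leaves out, such as division and percentage calculations, which need an explicit rounding rule.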