Subnormal Numbers and Compiler Optimizations: A Dangerous Combination
Subnormal numbers (previously known as denormal numbers) fill the underflow gap in floating-point arithmetic. Traditionally, underflow is said to occur when the exact result of a calculation is nonzero but smaller in magnitude than the smallest normalized floating-point number. Subnormal numbers represent the results of computations that are too small to be normalized, yet not so small that they round to zero.
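As a concrete illustration, C99's fpclassify macro (from math.h) distinguishes the two ranges; the following minimal sketch (the variable names and values are just for illustration) classifies a normal and a subnormal double:

#include <stdio.h>
#include <math.h>
#include <float.h>

int main(void) {
  double normal = 1.0e-300;  /* larger than DBL_MIN (~2.2e-308): normal */
  double tiny   = 1.1e-311;  /* smaller than DBL_MIN: subnormal */
  printf("DBL_MIN = %g\n", DBL_MIN);
  printf("1.0e-300 is %s\n", fpclassify(normal) == FP_SUBNORMAL ? "subnormal" : "normal");
  printf("1.1e-311 is %s\n", fpclassify(tiny) == FP_SUBNORMAL ? "subnormal" : "normal");
  return 0;
}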
While subnormal numbers are generated frequently in numerical software, combining such numbers with specific compiler optimizations can produce unexpected results.
Simple Example
Let’s examine a simple example where combining subnormal numbers and compiler optimizations produces inconsistencies.
Consider the following function that executes a division operation and makes a decision based on the result of the division:
#include <stdio.h>

void compute(double x, double y) {
  double z = x / y;
  if (z < 1.0)
    printf("Correct branch\n");
  else
    printf("Incorrect branch!!\n");
}
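To reproduce the runs below, a minimal driver along these lines can be used (a sketch; the exact driver is an assumption, with the inputs discussed next):

int main(void) {
  compute(1.1e-311, 2.0e-309);  /* both inputs are subnormal doubles */
  return 0;
}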
Compiling the code on an x86_64 system with the clang compiler at optimization level -O3 (with and without the -ffast-math option) produces the expected result. If we use 1.1e-311 and 2.0e-309 as the inputs x and y, the division produces 0.005499999999998974, so the code prints:
$ ./test
Correct branch
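For reference, this binary can be produced with a command along the lines of $ clang -O3 test.c -o test (the source file name test.c is an assumption).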
Note that the inputs x and y are subnormal floating-point numbers (both are smaller in magnitude than DBL_MIN, roughly 2.2e-308, the smallest normal double), but this doesn’t change the function’s behavior.
We now compile the code on a different system. We use an IBM POWER9 system and compile with the IBM xlc compiler using -O3. In this compiler, -O3 automatically enables non-strict IEEE mode, similar to the -ffast-math optimization mode in the clang and gcc compilers. This time, using the same input, we get a different result:
$ ./test
Incorrect branch!!
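This run comes from a build along the lines of $ xlc -O3 test.c -o test (again assuming the same source file).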
We now lower the optimization level in xlc to -O2, and this time we get the expected result:
$ ./test
Correct branch
What happened? How can the behavior change by simply switching the optimization level from -O3 to -O2?
To understand what happened, we can look at the assembly code (we use the Linux objdump tool to disassemble the binary). Here’s a portion of the assembly code for the -O2 case:
7 0000000000000000 <compute>:
8 0: 00 00 4c 3c addis r2,r12,0
9 4: 00 00 42 38 addi r2,r2,0
10 8: 91 ff 21 f8 stdu r1,-112(r1)
11 c: a6 02 08 7c mflr r0
12 10: 80 00 01 f8 std r0,128(r1)
13 14: c0 11 01 f0 xsdivdp vs0,vs1,vs2
14 18: 00 00 82 3c addis r4,r2,0
15 1c: 00 00 84 e8 ld r4,0(r4)
16 20: 8c 03 01 10 vspltisw v0,1
17 24: 78 23 83 7c mr r3,r4
18 28: e2 03 20 f0 xvcvsxwdp vs1,vs32
19 2c: 58 01 01 f0 xscmpodp cr0,vs1,vs0
20 30: 20 00 81 41 bgt 50 <compute+0x50>
...
The interesting thing to note in the above assembly is that the division operation is on line 13; the xsdivdp instruction corresponds to a double-precision division in the POWER architecture. Let’s now see the assembly code for the -O3 case:
7 0000000000000000 <compute>:
8 0: 00 00 4c 3c addis r2,r12,0
9 4: 00 00 42 38 addi r2,r2,0
10 8: 91 ff 21 f8 stdu r1,-112(r1)
11 c: a6 02 08 7c mflr r0
12 10: 80 00 01 f8 std r0,128(r1)
13 14: 68 11 80 f0 xsredp vs4,vs2
14 18: 00 00 62 3c addis r3,r2,0
15 1c: 00 00 63 e8 ld r3,0(r3)
16 20: 8c 03 01 10 vspltisw v0,1
17 24: 78 1b 64 7c mr r4,r3
18 28: e2 03 00 f0 xvcvsxwdp vs0,vs32
19 2c: 38 01 62 fc fmsub f3,f2,f4,f0
20 30: 88 1d 84 f0 xsnmsubadp vs4,vs4,vs3
21 34: 80 21 61 f0 xsmuldp vs3,vs1,vs4
22 38: 88 19 22 f0 xsmsubadp vs1,vs2,vs3
23 3c: 88 0d 64 f0 xsnmsubadp vs3,vs4,vs1
24 40: 18 19 00 f0 xscmpudp cr0,vs0,vs3
25 44: 1c 00 81 40 ble 60 <compute+0x60>
...
The division operation is gone at the -O3 optimization level; it was replaced with a reciprocal estimate (the xsredp instruction) followed by other operations, including multiplication.
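To see why a reciprocal-based sequence can fail on these inputs, here is a rough C sketch of the reciprocal-plus-refinement pattern (an illustration of the general technique, not the exact semantics of the POWER instructions above):

#include <stdio.h>

/* Approximates x/y as x * (1/y), refining the reciprocal with one
   Newton-Raphson step: r(2 - y*r). This mirrors the shape of the
   -O3 sequence above, not the exact xsredp semantics. */
static double div_via_recip(double x, double y) {
  double r = 1.0 / y;          /* stand-in for the hardware estimate */
  r = r + r * (1.0 - y * r);   /* Newton-Raphson refinement */
  return x * r;
}

int main(void) {
  /* y = 2.0e-309 is subnormal, so 1/y is about 5e308, which exceeds
     DBL_MAX (~1.8e308) and overflows to infinity; the refinement
     then computes inf - inf, producing a NaN. */
  printf("%.17g\n", div_via_recip(1.1e-311, 2.0e-309));
  return 0;
}

With the subnormal inputs from the example, this sketch prints nan, matching the -O3 behavior shown next.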
We see that with the higher optimization level (-O3), the compiler replaces the division with a sequence of potentially faster operations, which can improve performance. However, this optimization assumes that the inputs are normal floating-point numbers.
We add a printf("z = %.17g\n", z); statement to check the value of z in the -O2 and -O3 cases:
With -O2:
$ ./test
z = 0.0054999999999989736
Correct branch
With -O3:
$ ./test
z = nan
Incorrect branch!!
The optimized code, which replaces the division with a reciprocal-based sequence, produces a NaN because the inputs fall outside the range of normal number representations (i.e., they are subnormal numbers). Since any ordered comparison involving a NaN evaluates to false, the test z < 1.0 fails and the code takes the incorrect branch.
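The branch flip follows directly from the IEEE comparison rules for NaN; a tiny sketch (using nan from math.h) shows that both z < 1.0 and z >= 1.0 are false when z is a NaN:

#include <stdio.h>
#include <math.h>

int main(void) {
  double z = nan("");        /* quiet NaN */
  printf("%d\n", z < 1.0);   /* prints 0: ordered comparisons with NaN are false */
  printf("%d\n", z >= 1.0);  /* prints 0 as well */
  return 0;
}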
Conclusion
If you are using highly optimized floating-point code, it is good to know whether your application operates on subnormal numbers. The combination of optimized code and subnormal inputs can cause several kinds of numerical inconsistencies. Knowing the functions and lines of code that use subnormal numbers is an excellent first step toward understanding potential numerical inconsistency issues. FPChecker can perform a complete floating-point analysis of applications and report code locations that produce subnormal numbers.
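As a lightweight first check before reaching for a full tool, one can scan data for subnormal values with C99's fpclassify; the count_subnormals helper below is a hypothetical illustration, not part of FPChecker:

#include <stdio.h>
#include <stddef.h>
#include <math.h>

/* Returns how many elements of a[0..n-1] are subnormal doubles. */
static size_t count_subnormals(const double *a, size_t n) {
  size_t c = 0;
  for (size_t i = 0; i < n; ++i)
    if (fpclassify(a[i]) == FP_SUBNORMAL)
      ++c;
  return c;
}

int main(void) {
  double data[] = { 1.0, 1.1e-311, 2.0e-309, 0.0 };
  printf("%zu subnormal values\n", count_subnormals(data, 4));  /* prints 2 */
  return 0;
}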