You are given a floating-point number, e.g. a double type in Java or C++. You would like to convert it to an integer type… but only if the conversion is exact. In other words, you want to convert the floating-point number to an integer and check if the result is exact.
In C++, you could implement such a function using few lines of code:
bool to_int64_simple(double x, int64_t *out) { int64_t tmp = int64_t(x); *out = tmp; return tmp == x; }
The code is short in C++, and it will compile to few lines of assembly. In x64, you might get the following:
cvttsd2si rax, xmm0 pxor xmm1, xmm1 mov edx, 0 cvtsi2sd xmm1, rax mov QWORD PTR [rdi], rax ucomisd xmm1, xmm0 setnp al cmovne eax, edx
Technically, the code might rely on undefined behaviour.
You could do it in a different manner. Instead of working with high-level instructions, you could copy your binary floating-point number to a 64-bit word and use your knowledge of the IEEE binary64 standard to extract the mantissa and the exponent.
It is much more code. It also involves pesky branches. I came up with the following routine:
typedef union { struct { uint64_t fraction : 52; uint32_t exp_bias : 11; uint32_t sign : 1; } fields; double value; } binary64_float; bool to_int64(double x, int64_t *out) { binary64_float number = {.value = x}; const int shift = number.fields.exp_bias - 1023 - 52; if (shift > 0) { return false; } if (shift <= -64) { if (x == 0) { *out = 0; return true; } return false; } uint64_t integer = number.fields.fraction | 0x10000000000000; if (integer << (64 + shift)) { return false; } integer >>= -shift; *out = (number.fields.sign) ? -integer : integer; return true; }
How does it compare with the simple and naive routine?
I just wrote a simple benchmark where I iterate over many floating-point numbers in sequence, and I try to do the conversion. I effectively measure the throughput and report the number of nanosecond per function call. My benchmark does not stress the branch predictor. Unsurprisingly, the naive and simple approach can be faster.
| system | long version | simple version |
|---|---|---|
| ARM M1 + LLVM 12 | 0.95 ns/call | 1.1 ns/call |
| Intel 7700K + LLVM 13 | 1.6 ns/call | 1.2 ns/call |
I find reassuring that the simple approach is so fast. I was concerned at first that it would be burdensome.
It is possible that my long-form routine could be improved. However, it seems unlikely that it could ever be much more efficient than the simple version.
from Hacker News https://ift.tt/3B4Oqcq
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.