author | Rasmus Munk Larsen <rmlarsen@google.com> | 2019-11-15 17:09:46 -0800 |
---|---|---|
committer | Rasmus Munk Larsen <rmlarsen@google.com> | 2019-11-15 17:09:46 -0800 |
commit | f1e83073082f2733eec6235f2fdf251217a54ade (patch) | |
tree | a20a4945bf0083ffe1a4d4a617a7a2c4740ba00a /test/packetmath.cpp | |
parent | 2cb2915f908418c897773e0342f152768c13a0d8 (diff) |
1. Fix a bug in psqrt and make it return 0 for +inf arguments.
2. Simplify handling of special cases by taking advantage of the fact that the
builtin vrsqrt approximation handles negative, zero and +inf arguments correctly.
This speeds up the SSE and AVX implementations by ~20%.
3. Make the Newton-Raphson formula used for rsqrt more numerically robust:
Before: y = y * (1.5 - x/2 * y^2)
After: y = y * (1.5 - y * (x/2) * y)
Forming y^2 can overflow for very large or very small (denormalized) values of x, while the intermediate product y * (x/2) ~= sqrt(x)/2 stays in range and x*y^2 ~= 1. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.
4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.
Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o
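
The effect described in item 3 can be reproduced in scalar code. The following is a minimal sketch, not the Eigen implementation: the function names are hypothetical, and `rough_rsqrt_estimate` merely stands in for a hardware estimate such as vrsqrt by perturbing the exact value by an assumed relative error of 1e-3. It assumes IEEE 754 single precision with denormals enabled (no flush-to-zero) and no extended-precision intermediates.

```cpp
#include <cmath>

// One Newton-Raphson step for y ~= 1/sqrt(x), written the "before" way.
// y * y is formed explicitly, so it overflows to +inf once y exceeds
// roughly 1.8e19, which happens for denormal x.
inline float rsqrt_step_naive(float x, float y) {
  const float y2 = y * y;
  return y * (1.5f - (0.5f * x) * y2);
}

// The same step written the "after" way. The product y * (x/2) is about
// sqrt(x)/2, and multiplying it by y again gives about 0.5, so no
// intermediate value leaves the representable range.
inline float rsqrt_step_robust(float x, float y) {
  return y * (1.5f - (y * (0.5f * x)) * y);
}

// Hypothetical stand-in for a hardware rsqrt estimate: the exact value,
// computed in double precision, with an assumed relative error of 1e-3.
inline float rough_rsqrt_estimate(float x) {
  const double exact = 1.0 / std::sqrt(static_cast<double>(x));
  return static_cast<float>(exact * (1.0 + 1e-3));
}
```

For a denormal input such as `x = 1e-42f`, the estimate is about 1e21, so `rsqrt_step_naive` produces a non-finite result while `rsqrt_step_robust` returns a refined, finite value. For ordinary inputs the two functions agree up to rounding.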
Diffstat (limited to 'test/packetmath.cpp')
-rw-r--r-- | test/packetmath.cpp | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/test/packetmath.cpp b/test/packetmath.cpp
index d652082b0..64dd3dbf6 100644
--- a/test/packetmath.cpp
+++ b/test/packetmath.cpp
@@ -605,9 +605,8 @@ template<typename Scalar,typename Packet> void packetmath_real()
   }
 
   if(internal::random<float>(0,1)<0.1f)
-    data1[internal::random<int>(0, PacketSize)] = 0;
+    data1[internal::random<int>(0, PacketSize)] = 0;
   CHECK_CWISE1_IF(PacketTraits::HasSqrt, std::sqrt, internal::psqrt);
-  CHECK_CWISE1_IF(PacketTraits::HasSqrt, Scalar(1)/std::sqrt, internal::prsqrt);
   CHECK_CWISE1_IF(PacketTraits::HasLog, std::log, internal::plog);
   CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_i0, internal::pbessel_i0);
   CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_i0e, internal::pbessel_i0e);
@@ -616,6 +615,9 @@ template<typename Scalar,typename Packet> void packetmath_real()
   CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_j0, internal::pbessel_j0);
   CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_j1, internal::pbessel_j1);
 
+  data1[0] = std::numeric_limits<Scalar>::infinity();
+  CHECK_CWISE1_IF(PacketTraits::HasRsqrt, Scalar(1)/std::sqrt, internal::prsqrt);
+
   // Use a smaller data range for the positive bessel operations as these
   // can have much more error at very small and very large values.
   for (int i=0; i<size; ++i) {
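
The added test forces a +inf input and compares the result element by element against the scalar reference `Scalar(1)/std::sqrt`. A rough scalar sketch of that kind of check follows. It is an illustration under stated assumptions, not Eigen code: `rsqrt_candidate` and `check_cwise1` are hypothetical names, the candidate is not `internal::prsqrt`, the helper is not the `CHECK_CWISE1_IF` macro, and the tolerance is an arbitrary choice. The sketch also shows why +inf deserves its own test case: with an estimate of 0 for x = +inf, a Newton-Raphson step would evaluate 0 * inf, which is NaN, so the estimate has to be returned unchanged for such inputs.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical scalar stand-in for a vectorized rsqrt (not Eigen's prsqrt).
inline float rsqrt_candidate(float x) {
  // Stand-in for a hardware estimate. Under IEEE 754, 1/sqrt(+inf) is 0
  // and 1/sqrt(0) is +inf, so the special cases are already correct here.
  const float y0 = 1.0f / std::sqrt(x);
  // Refining 0 or +inf would compute 0 * inf = NaN, so keep the estimate.
  if (x == 0.0f || std::isinf(x)) return y0;
  // One Newton-Raphson step in the overflow-safe ordering.
  return y0 * (1.5f - (y0 * (0.5f * x)) * y0);
}

// Hypothetical coefficient-wise check: applies a reference and a candidate
// function to each input and compares the results with a relative tolerance.
template <typename Ref, typename Cand>
bool check_cwise1(const float* data, std::size_t n, Ref ref, Cand cand,
                  float tol = 1e-5f) {
  for (std::size_t i = 0; i < n; ++i) {
    const float r = ref(data[i]);
    const float c = cand(data[i]);
    if (std::isnan(r) && std::isnan(c)) continue;  // both NaN counts as a match
    if (r == c) continue;                          // exact match, including 0 and inf
    if (!(std::fabs(r - c) <= tol * std::fabs(r))) return false;
  }
  return true;
}
```

Calling `check_cwise1` on an array whose first element is `std::numeric_limits<float>::infinity()`, with a reference lambda that returns `1.0f / std::sqrt(v)`, mirrors the shape of the new test line: the reference yields 0 for the infinite element, and the candidate has to match it.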