diff options
author | Rasmus Munk Larsen <rmlarsen@google.com> | 2019-08-12 13:53:28 -0700 |
---|---|---|
committer | Rasmus Munk Larsen <rmlarsen@google.com> | 2019-08-12 13:53:28 -0700 |
commit | a3298b22ecd19a80a2fc03df3d463fdb04907c87 (patch) | |
tree | d6d9788ebd3a55539649240aa59a22479160a20a /Eigen/src/Core/arch/AVX | |
parent | d55d392e7b1136655b4223bea8e99cb2fe0a8afd (diff) |
Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.
Depending on instruction set, significant speedups are observed for the vectorized path:
log1p wall time is reduced 60-93% (2.5x - 15x speedup)
expm1 wall time is reduced 0-85% (1x - 7x speedup)
The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly.
Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM
Diffstat (limited to 'Eigen/src/Core/arch/AVX')
-rw-r--r-- | Eigen/src/Core/arch/AVX/MathFunctions.h | 10 | ||||
-rw-r--r-- | Eigen/src/Core/arch/AVX/PacketMath.h | 2 |
2 files changed, 12 insertions, 0 deletions
diff --git a/Eigen/src/Core/arch/AVX/MathFunctions.h b/Eigen/src/Core/arch/AVX/MathFunctions.h index 9f375ed98..c6d3cf6a0 100644 --- a/Eigen/src/Core/arch/AVX/MathFunctions.h +++ b/Eigen/src/Core/arch/AVX/MathFunctions.h @@ -36,6 +36,16 @@ plog<Packet8f>(const Packet8f& _x) { return plog_float(_x); } +template<> EIGEN_DEFINE_FUNCTION_ALLOWING_MULTIPLE_DEFINITIONS EIGEN_UNUSED +Packet8f plog1p<Packet8f>(const Packet8f& _x) { + return generic_plog1p(_x); +} + +template<> EIGEN_DEFINE_FUNCTION_ALLOWING_MULTIPLE_DEFINITIONS EIGEN_UNUSED +Packet8f pexpm1<Packet8f>(const Packet8f& _x) { + return generic_expm1(_x); +} + // Exponential function. Works by writing "x = m*log(2) + r" where // "m = floor(x/log(2)+1/2)" and "r" is the remainder. The result is then // "exp(x) = 2^m*exp(r)" where exp(r) is in the range [-1,1). diff --git a/Eigen/src/Core/arch/AVX/PacketMath.h b/Eigen/src/Core/arch/AVX/PacketMath.h index 7ee9dee10..5233195f3 100644 --- a/Eigen/src/Core/arch/AVX/PacketMath.h +++ b/Eigen/src/Core/arch/AVX/PacketMath.h @@ -65,6 +65,8 @@ template<> struct packet_traits<float> : default_packet_traits HasSin = EIGEN_FAST_MATH, HasCos = EIGEN_FAST_MATH, HasLog = 1, + HasLog1p = 1, + HasExpm1 = 1, HasExp = 1, HasSqrt = 1, HasRsqrt = 1, |