author | Rasmus Munk Larsen <rmlarsen@google.com> | 2020-11-24 20:53:07 +0000 |
---|---|---|
committer | Rasmus Munk Larsen <rmlarsen@google.com> | 2020-11-24 20:53:07 +0000 |
commit | c770746d709686ef2b8b652616d9232f9b028e78 (patch) | |
tree | 624821fa175d8f40cc13886d7483ffd35e9da1e3 /Eigen/src/Core/arch/SSE/PacketMath.h | |
parent | 22f67b59585805fedf86759f7013b2b670f83386 (diff) |
Fix Half NaN definition and test.
The `half_float` test was failing with `-mcpu=cortex-a55` (native `__fp16`) due
to a bad NaN bit-pattern comparison (in the case of casting a float to `__fp16`,
the signaling `NaN` is quieted). There was also an inconsistency between
`numeric_limits<half>::quiet_NaN()` and `NumTraits::quiet_NaN()`. Here we
correct the inconsistency and compare NaNs according to the IEEE 754
definition.
Also modified the `bfloat16_float` test to match.
Tested with `cortex-a53` and `cortex-a55`.
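The IEEE 754 definition mentioned above can be sketched as a bit-level check: a value is a NaN when all exponent bits are set and the mantissa is nonzero, regardless of the quiet/signaling bit. The following is a minimal illustration only, not the code from this commit; the helper names `is_nan_half_bits`, `is_nan_bfloat16_bits`, and `check_examples` are hypothetical and operate on raw 16-bit patterns rather than on Eigen's `half` or `bfloat16` types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Eigen): IEEE 754 binary16 has
// 1 sign bit, 5 exponent bits (mask 0x7c00) and 10 mantissa bits (mask 0x03ff).
// NaN <=> exponent all ones AND mantissa nonzero.
constexpr bool is_nan_half_bits(std::uint16_t bits) {
  return (bits & 0x7c00u) == 0x7c00u && (bits & 0x03ffu) != 0;
}

// Hypothetical helper (not part of Eigen): bfloat16 has
// 1 sign bit, 8 exponent bits (mask 0x7f80) and 7 mantissa bits (mask 0x007f).
constexpr bool is_nan_bfloat16_bits(std::uint16_t bits) {
  return (bits & 0x7f80u) == 0x7f80u && (bits & 0x007fu) != 0;
}

// A quieted NaN and a signaling NaN have different bit patterns,
// but both satisfy the definition, so a test should not compare
// against one specific pattern.
inline void check_examples() {
  assert(is_nan_half_bits(0x7e00));      // quiet NaN (top mantissa bit set)
  assert(is_nan_half_bits(0x7d00));      // signaling NaN (top mantissa bit clear)
  assert(!is_nan_half_bits(0x7c00));     // +infinity: mantissa is zero
  assert(is_nan_bfloat16_bits(0x7fc0));  // bfloat16 quiet NaN
  assert(!is_nan_bfloat16_bits(0x7f80)); // bfloat16 +infinity
}
```

Checking the exponent and mantissa fields this way accepts any NaN payload, which is why it tolerates a cast that quiets a signaling NaN.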
Diffstat (limited to 'Eigen/src/Core/arch/SSE/PacketMath.h')
-rwxr-xr-x | Eigen/src/Core/arch/SSE/PacketMath.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/Eigen/src/Core/arch/SSE/PacketMath.h b/Eigen/src/Core/arch/SSE/PacketMath.h
index ef77ab6fa..b68abec64 100755
--- a/Eigen/src/Core/arch/SSE/PacketMath.h
+++ b/Eigen/src/Core/arch/SSE/PacketMath.h
@@ -267,6 +267,10 @@ template<> EIGEN_STRONG_INLINE Packet16b pset1<Packet16b>(const bool& from) {
 template<> EIGEN_STRONG_INLINE Packet4f pset1frombits<Packet4f>(unsigned int from) { return _mm_castsi128_ps(pset1<Packet4i>(from)); }
 template<> EIGEN_STRONG_INLINE Packet2d pset1frombits<Packet2d>(uint64_t from) { return _mm_castsi128_pd(_mm_set1_epi64x(from)); }
 
+template<> EIGEN_STRONG_INLINE Packet4f peven_mask(const Packet4f& /*a*/) {
+  return Packet4f(_mm_set_epi32(0, 0xffffffff, 0, 0xffffffff));
+}
+
 template<> EIGEN_STRONG_INLINE Packet4f pzero(const Packet4f& /*a*/) { return _mm_setzero_ps(); }
 template<> EIGEN_STRONG_INLINE Packet2d pzero(const Packet2d& /*a*/) { return _mm_setzero_pd(); }
 template<> EIGEN_STRONG_INLINE Packet4i pzero(const Packet4i& /*a*/) { return _mm_setzero_si128(); }
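
The argument order of `_mm_set_epi32` is easy to misread: its parameters run from the highest lane down to the lowest, so `_mm_set_epi32(0, 0xffffffff, 0, 0xffffffff)` sets lanes 0 and 2 to all ones and lanes 1 and 3 to zero. The portable scalar sketch below models that layout without SSE intrinsics. All names in it (`Lanes`, `set_epi32_model`, `even_mask_model`, `mask_floats`) are hypothetical, and reading `peven_mask` as "keep the even-indexed lanes" is an inference from the function name and the constant, not something stated in the commit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Four 32-bit lanes, index 0 being the lowest lane of the 128-bit register.
using Lanes = std::array<std::uint32_t, 4>;

// Scalar model of _mm_set_epi32(e3, e2, e1, e0): the LAST argument
// lands in lane 0 and the FIRST argument in lane 3.
inline Lanes set_epi32_model(std::uint32_t e3, std::uint32_t e2,
                             std::uint32_t e1, std::uint32_t e0) {
  return Lanes{{e0, e1, e2, e3}};
}

// Same constant as in the diff: lanes 0 and 2 become all ones.
inline Lanes even_mask_model() {
  return set_epi32_model(0u, 0xffffffffu, 0u, 0xffffffffu);
}

// Bitwise AND of each float's bit pattern with the matching mask lane.
// An all-ones lane keeps the value; a zero lane yields +0.0f.
inline std::array<float, 4> mask_floats(const std::array<float, 4>& v,
                                        const Lanes& mask) {
  std::array<float, 4> out{};
  for (std::size_t i = 0; i < 4; ++i) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &v[i], sizeof(bits));
    bits &= mask[i];
    std::memcpy(&out[i], &bits, sizeof(bits));
  }
  return out;
}
```

Applied to the values `{1, 2, 3, 4}`, this model keeps lanes 0 and 2 and zeroes lanes 1 and 3.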