aboutsummaryrefslogtreecommitdiffhomepage
path: root/test/packetmath_test_shared.h
Commit message (Collapse)AuthorAge
* Fix rint SSE/NEON again, using optimization barrier.Gravatar Antonio Sanchez2021-03-05
| | | | | | | | | | | | | | | | | | | | This is a new version of !423, which failed for MSVC. Defined `EIGEN_OPTIMIZATION_BARRIER(X)` that uses inline assembly to prevent operations involving `X` from crossing that barrier. Should work on most `GNUC` compatible compilers (MSVC doesn't seem to need this). This is a modified version adapted from what was used in `psincos_float` and tested on more platforms (see #1674, https://godbolt.org/z/73ezTG). Modified `rint` to use the barrier to prevent the add/subtract rounding trick from being optimized away. Also fixed an edge case for large inputs that get bumped up a power of two and ends up rounding away more than just the fractional part. If we are over `2^digits` then just return the input. This edge case was missed in the test since the test was comparing approximate equality, which was still satisfied. Adding a strict equality option catches it.
* Revert "Fix rint for SSE/NEON."Gravatar Antonio Sánchez2021-03-03
| | | This reverts commit e72dfeb8b9fa5662831b5d0bb9d132521f9173dd
* Fix rint for SSE/NEON.Gravatar Antonio Sanchez2021-03-03
| | | | | | | | | | | | | | It seems *sometimes* with aggressive optimizations the combination `psub(padd(a, b), b)` trick to force rounding is compiled away. Here we replace with inline assembly to prevent this (I tried `volatile`, but that leads to additional loads from memory). Also fixed an edge case for large inputs `a` where adding `b` bumps the value up a power of two and ends up rounding away more than just the fractional part. If we are over `2^digits` then just return the input. This edge case was missed in the test since the test was comparing approximate equality, which was still satisfied. Adding a strict equality option catches it.
* Fix pfrexp/pldexp for half.Gravatar Antonio Sanchez2021-01-21
| | | | | | | | | | The recent addition of vectorized pow (!330) relies on `pfrexp` and `pldexp`. This was missing for `Eigen::half` and `Eigen::bfloat16`. Adding tests for these packet ops also exposed an issue with handling negative values in `pfrexp`, returning an incorrect exponent. Added the missing implementations, corrected the exponent in `pfrexp1`, and added `packetmath` tests.
* Fix MSVC complex sqrt and packetmath test.Gravatar Antonio Sanchez2021-01-08
| | | | | | | | | MSVC incorrectly handles `inf` cases for `std::sqrt<std::complex<T>>`. Here we replace it with a custom version (currently used on GPU). Also fixed the `packetmath` test, which previously skipped several corner cases since `CHECK_CWISE1` only tests the first `PacketSize` elements.
* Clean up packetmath tests and fix various bugs to make bfloat16 pass ↵Gravatar Rasmus Munk Larsen2020-10-09
| | | | (almost) all packetmath tests with SSE, AVX, and AVX512.
* Add missing functions for Packet8bf in Altivec architecture.Gravatar Pedro Caldeira2020-09-08
| | | | | Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.
* Support BFloat16 in EigenGravatar Teng Lu2020-06-20
|
* Remove packet ops pinsertfirst and pinsertlast that are only used in a ↵Gravatar Rasmus Munk Larsen2020-05-08
| | | | | | | | | | | | | | | | single place, and can be replaced by other ops when constructing the first/final packet in linspaced_op_impl::packetOp. I cannot measure any performance changes for SSE, AVX, or AVX512. name old time/op new time/op delta BM_LinSpace<float>/1 1.63ns ± 0% 1.63ns ± 0% ~ (p=0.762 n=5+5) BM_LinSpace<float>/8 4.92ns ± 3% 4.89ns ± 3% ~ (p=0.421 n=5+5) BM_LinSpace<float>/64 34.6ns ± 0% 34.6ns ± 0% ~ (p=0.841 n=5+5) BM_LinSpace<float>/512 217ns ± 0% 217ns ± 0% ~ (p=0.421 n=5+5) BM_LinSpace<float>/4k 1.68µs ± 0% 1.68µs ± 0% ~ (p=1.000 n=5+5) BM_LinSpace<float>/32k 13.3µs ± 0% 13.3µs ± 0% ~ (p=0.905 n=5+4) BM_LinSpace<float>/256k 107µs ± 0% 107µs ± 0% ~ (p=0.841 n=5+5) BM_LinSpace<float>/1M 427µs ± 0% 427µs ± 0% ~ (p=0.690 n=5+5)
* Bug #1790: Make `areApprox` check `numext::isnan` instead of bitwise ↵Gravatar Christoph Hertzberg2020-01-11
| | | | equality (NaNs don't have to be bitwise equal).
* Added special_packetmath test and tweaked bounds on tests.Gravatar Srinivas Vasudevan2020-01-11
Refactor shared packetmath code to header file. (Squashed from PR !38)