Commit message (Collapse)AuthorAge
* Allow implicit conversion from bfloat16 to float and doubleGravatar Niels Dekker2020-07-11
| Conversion from `bfloat16` to `float` and `double` is lossless. It seems natural to allow the conversion to be implicit, as the C++ language also supports implicit conversion from a smaller to a larger floating point type.
|
| Intel's oneDNN bfloat16 implementation also has an implicit `operator float()`: https://github.com/oneapi-src/oneDNN/blob/v1.5/src/common/bfloat16.hpp
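To illustrate why the widening conversion described above is lossless, here is a minimal sketch. It assumes the usual bfloat16 layout, where the 16 stored bits are the upper half of an IEEE 754 single-precision value. `Bf16` and `bf16_from_float_truncate` are hypothetical names for this example only and are not Eigen's implementation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bfloat16 type; not Eigen's actual class.
struct Bf16 {
  std::uint16_t bits;  // 1 sign bit, 8 exponent bits, 7 mantissa bits

  // Widening is exact: the 16 stored bits become the upper half of a float
  // and the lower 16 bits are zero, so no rounding takes place.
  operator float() const {
    std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
    float out;
    std::memcpy(&out, &wide, sizeof(out));
    return out;
  }
};

// Helper for building example values only: drops the low 16 bits of a float.
inline Bf16 bf16_from_float_truncate(float f) {
  std::uint32_t wide;
  std::memcpy(&wide, &f, sizeof(wide));
  Bf16 r;
  r.bits = static_cast<std::uint16_t>(wide >> 16);
  return r;
}
```

With the implicit operator in place, `float f = b;` and `double d = b;` both compile for a `Bf16 b` without a cast. The `double` case goes through `float` first, followed by the standard float-to-double promotion.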
* Guard operator<< test by EIGEN_NO_IO.Gravatar Rasmus Munk Larsen2020-07-09
|
* Guard operator<< by EIGEN_NO_IO.Gravatar Rasmus Munk Larsen2020-07-09
|
* Add operator<< to print a quaternion.Gravatar Rasmus Munk Larsen2020-07-09
|
* Fix test basic stuffGravatar David Tellenbach2020-07-09
| - Guard fundamental types that are not available pre C++11
| - Separate subsequent angle brackets >> by spaces
| - Allow casting of Eigen::half and Eigen::bfloat16 to complex types
* Add operator==/operator!= to Quaternion. Fixes #1876.Gravatar Forrest Voight2020-07-07
|
* Change the sign operator in Eigen to return NaN for NaN arguments, not zero.Gravatar Rasmus Munk Larsen2020-07-07
|
* Make test packetmath C++98 compliantGravatar David Tellenbach2020-07-01
|
* BF16 for scalar_cmp_with_cast_opGravatar Sheng Yang2020-07-01
|
* Delete duplicate test cases in vectorization_logic.cppGravatar Kan Chen2020-07-01
|
* Fix tensor casts for large packets and casts to/from std::complexGravatar Antonio Sanchez2020-06-30
| The original tensor casts were only defined for `SrcCoeffRatio`:`TgtCoeffRatio` 1:1, 1:2, 2:1, 4:1. Here we add the missing 1:N and 8:1.
|
| We also add casting `Eigen::half` to/from `std::complex<T>`, which was missing, to make it consistent with `Eigen::bfloat16`, and generalize the overload to work for any complex type.
|
| Tests were added to `basicstuff`, `packetmath`, and `cxx11_tensor_casts` to test all cast configurations.
* Fix denormal check pre c++11.Gravatar Antonio Sanchez2020-06-30
| | | | | `float_denorm_style` is an old-style `enum`, so the `denorm_present` symbol only exists in the `std` namespace prior to c++11.
* Report custom C++ flags in CMake testing summaryGravatar David Tellenbach2020-06-30
|
* Remove CI tags to enable shared runnersGravatar David Tellenbach2020-06-29
|
* Pass CMAKE_MAKE_PROGRAM to Fortran language support testGravatar Christoph Grüninger2020-06-27
| | | | | Otherwise the Make (or Ninja) program is used, which is installed system wide.
* Add initial CI configuration file.Gravatar David Tellenbach2020-06-27
| | | | | The initial CI configuration consists of jobs to build and run tests and to build docs.
* Fix packetmath_1 float tests for arm/aarch64.Gravatar Antonio Sanchez2020-06-24
| Added missing `pmadd<Packet2f>` for NEON. This gives significantly better precision than the previous `pmul+padd`, which was causing the `pcos` tests to fail. Also added an approx test with `std::sin`/`std::cos` since otherwise returning any `a^2+b^2=1` would pass.
|
| Modified `log(denorm)` tests. Denorms are not supported by all systems (returns `::min`), are always flushed to zero on 32-bit arm, and configurably flush to zero on sse/avx/aarch64. This leads to inconsistent results across different systems (e.g. `-inf` vs `nan`). Added a check for existence and exclude ARM.
|
| Removed logistic exactness test, since scalar and vectorized versions follow different code-paths due to differences in `pexp` and `pmadd`, which result in slightly different values. For example, exactness always fails on arm, aarch64, and altivec.
* Replaced call to deprecated 'load' function with appropriate call to 'on'.Gravatar Simon Pfreundschuh2020-06-23
|
* Add missing Packet2l/Packet2ul ops for NEON.Gravatar Antonio Sanchez2020-06-22
| The current multiply (`pmul`) and comparison operators (`pcmp_lt`, `pcmp_le`, `pcmp_eq`) are missing for packets `Packet2l` and `Packet2ul`. This leads to compile errors for the `packetmath.cpp` tests in clang.
|
| Here we add and test the missing ops.
|
| Tested:
| ```
| $ aarch64-linux-gnu-g++ -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
| $ adb push packetmath /data/local/tmp/
| $ adb shell "/data/local/tmp/packetmath"
|
| $ arm-linux-gnueabihf-g++ -mfpu=neon -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
| $ adb push packetmath /data/local/tmp/
| $ adb shell "/data/local/tmp/packetmath"
|
| $ clang++ -target aarch64-linux-android21 -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
| $ adb push packetmath /data/local/tmp/
| $ adb shell "/data/local/tmp/packetmath"
|
| $ clang++ -target armv7-linux-android21 -static -mfpu=neon -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
| $ adb push packetmath /data/local/tmp/
| $ adb shell "/data/local/tmp/packetmath"
| ```
* Added missing NEON pcasts, update packetmath tests.Gravatar Antonio Sanchez2020-06-21
| The NEON `pcast` operators are all implemented and tested for existing packets. This requires adding a `pcast(a,b,c,d,e,f,g,h)` for casting between `int64_t` and `int8_t` in `GenericPacketMath.h`.
|
| Removed incorrect `HasHalfPacket` definition for NEON's `Packet2l`/`Packet2ul`.
|
| Adjustments were also made to the `packetmath` tests. These include
| - minor bug fixes for cast tests (i.e. 4:1 casts, only casting for packets that are vectorizable)
| - added 8:1 cast tests
| - random number generation: the original had uninteresting 0 to 0 casts for many casts between floating-point and integers, and exhibited signed overflow undefined behavior
|
| Tested:
| ```
| $ aarch64-linux-gnu-g++ -static -I./ '-DEIGEN_TEST_PART_ALL=1' test/packetmath.cpp -o packetmath
| $ adb push packetmath /data/local/tmp/
| $ adb shell "/data/local/tmp/packetmath"
| ```
* Support BFloat16 in EigenGravatar Teng Lu2020-06-20
|
* Add Apache 2.0 license text in COPYING.APACHE.Gravatar Rasmus Munk Larsen2020-06-18
|
* Update `things you can do` message using cmake commandsGravatar Nicolas Mellado2020-06-16
| | | | Print cmake commands instead of make commands, which should work for any generator.
* Run two independent chains, when reducing tensors.Gravatar Ilya Tokar2020-06-16
| Running two chains exposes more instruction-level parallelism by allowing both chains to execute at the same time.
|
| Results are a bit noisy, but for medium lengths we almost hit the theoretical upper bound of 2x.
|
| BM_fullReduction_16T/3   [using 16 threads]  17.3ns ±11%  17.4ns ± 9%     ~     (p=0.178 n=18+19)
| BM_fullReduction_16T/4   [using 16 threads]  17.6ns ±17%  17.0ns ±18%     ~     (p=0.835 n=20+19)
| BM_fullReduction_16T/7   [using 16 threads]  18.9ns ±12%  18.2ns ±10%     ~     (p=0.756 n=20+18)
| BM_fullReduction_16T/8   [using 16 threads]  19.8ns ±13%  19.4ns ±21%     ~     (p=0.512 n=20+20)
| BM_fullReduction_16T/10  [using 16 threads]  23.5ns ±15%  20.8ns ±24%  -11.37%  (p=0.000 n=20+19)
| BM_fullReduction_16T/15  [using 16 threads]  35.8ns ±21%  26.9ns ±17%  -24.76%  (p=0.000 n=20+19)
| BM_fullReduction_16T/16  [using 16 threads]  38.7ns ±22%  27.7ns ±18%  -28.40%  (p=0.000 n=20+19)
| BM_fullReduction_16T/31  [using 16 threads]   146ns ±17%    74ns ±11%  -49.05%  (p=0.000 n=20+18)
| BM_fullReduction_16T/32  [using 16 threads]   154ns ±19%    84ns ±30%  -45.79%  (p=0.000 n=20+19)
| BM_fullReduction_16T/64  [using 16 threads]   603ns ± 8%   308ns ±12%  -48.94%  (p=0.000 n=17+17)
| BM_fullReduction_16T/128 [using 16 threads]  2.44µs ±13%  1.22µs ± 1%  -50.29%  (p=0.000 n=17+17)
| BM_fullReduction_16T/256 [using 16 threads]  9.84µs ±14%  5.13µs ±30%  -47.82%  (p=0.000 n=19+19)
| BM_fullReduction_16T/512 [using 16 threads]  78.0µs ± 9%  56.1µs ±17%  -28.02%  (p=0.000 n=18+20)
| BM_fullReduction_16T/1k  [using 16 threads]   325µs ± 5%   263µs ± 4%  -19.00%  (p=0.000 n=20+16)
| BM_fullReduction_16T/2k  [using 16 threads]  1.09ms ± 3%  0.99ms ± 1%   -9.04%  (p=0.000 n=20+20)
| BM_fullReduction_16T/4k  [using 16 threads]  7.66ms ± 3%  7.57ms ± 3%   -1.24%  (p=0.017 n=20+20)
| BM_fullReduction_16T/10k [using 16 threads]  65.3ms ± 4%  65.0ms ± 3%     ~     (p=0.718 n=20+20)
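The idea of two independent chains can be sketched on a plain array sum. This is only an illustration of the technique under simplifying assumptions (scalar `float` data, a hypothetical `sum_two_chains` helper), not the tensor reducer changed by the commit.

```cpp
#include <cstddef>

// Hypothetical helper, not Eigen's reducer: sums n floats using two
// accumulators so that consecutive additions do not depend on each other.
inline float sum_two_chains(const float* data, std::size_t n) {
  float acc0 = 0.0f;  // chain 0: even indices
  float acc1 = 0.0f;  // chain 1: odd indices
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    acc0 += data[i];      // independent of acc1, so both additions
    acc1 += data[i + 1];  // can be in flight at the same time
  }
  if (i < n) acc0 += data[i];  // leftover element when n is odd
  return acc0 + acc1;          // combine the two chains once at the end
}
```

Note that splitting the sum changes the order of the floating-point additions, so the result can differ from a single-chain sum in the last bits.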
* Fix pscatter and pgather for Altivec Complex doubleGravatar Pedro Caldeira2020-06-16
|
* Fix unused variable warning on ArmGravatar David Tellenbach2020-06-15
|
* Fix #1818: SparseLU: add methods nnzL() and nnzU()Gravatar Sebastien Boisvert2020-06-11
| Now this compiles without errors:
|
| $ clang++ -I ../../ test_sparseLU.cpp -std=c++03
* Fix #1911: add benchmark for move semantics with fixed-size matrixGravatar Sebastien Boisvert2020-06-11
| $ clang++ -O3 bench/bench_move_semantics.cpp -I. -std=c++11 \
|     -o bench_move_semantics
|
| $ ./bench_move_semantics
| float copy semantics: 1755.97 ms
| float move semantics: 55.063 ms
| double copy semantics: 2457.65 ms
| double move semantics: 55.034 ms
* Remove HasCast and fix packetmath cast tests.Gravatar Antonio Sanchez2020-06-11
| The use of the `packet_traits<>::HasCast` field is currently inconsistent with `type_casting_traits<>`, and is unused apart from within `test/packetmath.cpp`. In addition, those packetmath cast tests do not currently reflect how casts are performed in practice: they ignore the `SrcCoeffRatio` and `TgtCoeffRatio` fields, assuming a 1:1 ratio.
|
| Here we remove the unused `HasCast`, and modify the packet cast tests to better reflect their usage.
* Fix #1757: remove the word 'suicide'Gravatar Sebastien Boisvert2020-06-11
|
* Implement scalar_cmp_with_cast_opGravatar ShengYang12020-06-09
|
* Fix static analyzer warning in SelfadjointProduct.h.Gravatar Rasmus Munk Larsen2020-06-08
| | | | Fix compiler warnings in GeneralBlockPanelKernel.h.
* Update FindComputeCpp.cmake to fix build problems on WindowsGravatar Thales Sabino2020-06-05
| - Use standard types in SYCL/PacketMath.h to avoid compilation problems on Windows
| - Add EIGEN_HAS_CONSTEXPR to cxx11_tensor_argmax_sycl.cpp to fix build problems on Windows
* Revert ".gitlab-ci.yml: initial commit"Gravatar David Tellenbach2020-06-05
| | | | | This reverts commit 95177362edc9c814a102c8a2236695c632892232 to disable GitLab CI temporarily.
* Fix broken packetmath test for logistic on Arm.Gravatar Rasmus Munk Larsen2020-06-04
|
* Fix typo in previous update to generic predux_any.Gravatar Rasmus Munk Larsen2020-06-04
|
* Avoid implicit float equality comparison in generic predux_any, but use ↵Gravatar Rasmus Munk Larsen2020-06-04
| | | | numext::not_equal_strict to avoid breaking builds that compile with -Werror=float-equal.
* Fix compilation error in logistic packet op.Gravatar Rasmus Munk Larsen2020-06-03
|
* Update run instructions for benchCholeskyGravatar n0mend2020-06-01
|
* Bug #1777: make the scalar and packet path consistent for the logistic ↵Gravatar Gael Guennebaud2020-05-31
| | | | function + respective unit test
* Fix #556: warnings with mingwGravatar Gael Guennebaud2020-05-31
|
* Bug #1767: increase required cmake version to 3.5.0Gravatar Gael Guennebaud2020-05-31
|
* Fix #1833: compilation issue of "array!=scalar" with c++20Gravatar Gael Guennebaud2020-05-30
|
* Save one extra temporary when assigning a sparse product to a row-major ↵Gravatar Gael Guennebaud2020-05-30
| | | | sparse matrix
* .gitlab-ci.yml: initial commitGravatar Christoph Junghans2020-05-29
|
* Add support for PacketBlock<Packet8s,4> and PacketBlock<Packet16uc,4> ↵Gravatar Kan Chen2020-05-29
| | | | ptranspose on NEON
* Disable test for 32-bit systems (e.g. ARM, i386)Gravatar Antonio Sánchez2020-05-28
| Neither i386 nor 32-bit ARM defines __uint128_t. On most systems, if __uint128_t is defined, then so is the macro __SIZEOF_INT128__.
|
| https://stackoverflow.com/questions/18531782/how-to-know-if-uint128-t-is-defined1
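A guard of the kind described above might look like the following sketch. `DEMO_HAS_UINT128`, `demo_wide_uint` and `demo_has_uint128` are hypothetical names for this example and do not come from Eigen or its tests. The sketch assumes a GCC- or Clang-compatible compiler, where `__SIZEOF_INT128__` accompanies `__uint128_t`.

```cpp
// DEMO_HAS_UINT128 is a hypothetical macro for this sketch, not an Eigen macro.
#if defined(__SIZEOF_INT128__)
#define DEMO_HAS_UINT128 1
typedef __uint128_t demo_wide_uint;         // 128-bit type is available
#else
#define DEMO_HAS_UINT128 0
typedef unsigned long long demo_wide_uint;  // fallback on e.g. i386, 32-bit ARM
#endif

// Reports at run time which branch was taken at compile time.
inline bool demo_has_uint128() { return DEMO_HAS_UINT128 != 0; }
```

A test that needs 128-bit arithmetic can then be wrapped in `#if DEMO_HAS_UINT128` so that it is skipped on targets without the type.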
* Fix incorrect usage of `if defined(EIGEN_ARCH_PPC)` => `if EIGEN_ARCH_PPC`Gravatar Yong Tang2020-05-28
| This PR tries to fix an incorrect usage of `if defined(EIGEN_ARCH_PPC)` in the `Eigen/Core` header.
|
| In `Eigen/src/Core/util/Macros.h`, EIGEN_ARCH_PPC is explicitly defined as either 0 or 1. As a result, `if defined(EIGEN_ARCH_PPC)` will always be true. This causes issues when building on non-PPC platforms, where `MatrixProduct.h` is not available.
|
| This fix changes `if defined(EIGEN_ARCH_PPC)` => `if EIGEN_ARCH_PPC`.
|
| Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
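The difference between the two preprocessor checks can be reproduced in isolation. In this sketch `DEMO_ARCH_PPC` is a hypothetical macro that mimics one which is always defined as 0 or 1. The constant names are likewise made up for the example.

```cpp
// DEMO_ARCH_PPC is hypothetical; it mimics a macro that is always defined,
// with the value 0 on non-PPC targets and 1 on PPC targets.
#define DEMO_ARCH_PPC 0

#if defined(DEMO_ARCH_PPC)
const bool kDefinedCheck = true;   // taken: the macro exists, even though it is 0
#else
const bool kDefinedCheck = false;
#endif

#if DEMO_ARCH_PPC
const bool kValueCheck = true;
#else
const bool kValueCheck = false;    // taken: the macro's value is 0
#endif
```

Here `kDefinedCheck` is `true` while `kValueCheck` is `false`, which is why a test on the value rather than on the definition is needed for macros of this kind.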
* Fix #1874: it works on both MSVC 2017 and other platforms.Gravatar Kan Chen2020-05-21
|
* Add pscatter for Packet16{u}c (int8)Gravatar Pedro Caldeira2020-05-20
|