eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Fixing a CUDA / P100 regression introduced by PR 181	Deven Desai	2020-08-20
\| \| \| \| \| \|	PR 181 ( https://gitlab.com/libeigen/eigen/-/merge_requests/181 ) adds the `__launch_bounds__(1024)` attribute to GPU kernels that did not have that attribute explicitly specified. That PR seems to cause regressions on the CUDA platform. This commit restricts the changes from PR 181 so that they apply to HIP only.
*	Add missing inline keyword in Quaternion.h.	Rasmus Munk Larsen	2020-08-14
\|
*	Replace the call to int64_t in the blasutil test by explicit types	David Tellenbach	2020-08-14
\| \| \| \| \| \| \| \| \|	Some platforms define int64_t to be long long even for C++03. In that case the definition of internal::make_unsigned for this type is missing. Simply adding the specialization causes duplicate-definition errors on platforms that define int64_t as signed long for C++03. We need to find a way to distinguish both cases at compile time.
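The clash described above can be sketched as follows. This is a minimal illustration, not Eigen's `internal::make_unsigned`: the names `make_unsigned_sketch`, `int64_is_long` and `int64_is_long_long` are hypothetical, and the code assumes C++11, where `long long` and `<type_traits>` are available (which is exactly what a C++03 build lacks). Specializing on the built-in types `long` and `long long`, rather than on the alias `int64_t`, avoids duplicate definitions whichever type the platform picks.

```cpp
#include <cstdint>
#include <type_traits>

// Compile-time queries: which built-in type does this platform use for int64_t?
constexpr bool int64_is_long() {
  return std::is_same<std::int64_t, long>::value;
}
constexpr bool int64_is_long_long() {
  return std::is_same<std::int64_t, long long>::value;
}

// Hypothetical trait, specialized on built-in types only. `long` and
// `long long` are always distinct types, so these two specializations
// never collide, and int64_t resolves to one of them on common platforms.
template <typename T>
struct make_unsigned_sketch;

template <>
struct make_unsigned_sketch<long> {
  typedef unsigned long type;
};

template <>
struct make_unsigned_sketch<long long> {
  typedef unsigned long long type;
};

// Works regardless of which alias the platform chose.
typedef make_unsigned_sketch<std::int64_t>::type uint64_sketch_t;
```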
*	bfloat16 packetmath for Arm Neon backend	David Tellenbach	2020-08-13
\|
*	Add support for Bfloat16 to use vector instructions on Altivec	Pedro Caldeira	2020-08-10
\| \| \| \|	architecture
*	Temporarily turn off the NEON implementation of pfloor as it does not work ↵	Zachary Garrett	2020-08-04
\| \| \| \| \| \|	for large values. The NEON implementation mimics the SSE implementation, but did not mention the caveat that, because it relies on signed integer conversions, not all values representable in the original floating-point type are supported.
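The caveat can be illustrated with a scalar sketch (this is not the NEON or SSE code itself). A floor built on a truncating conversion to a signed 32-bit integer is only valid while the input fits in that integer's range. The function names are hypothetical, and falling back to `std::floor` is just one possible guard, not necessarily what Eigen does.

```cpp
#include <cmath>
#include <cstdint>

// True when x can be converted to int32 without overflow (NaN yields false).
inline bool fits_in_int32(float x) {
  return x >= -2147483648.0f && x < 2147483648.0f;
}

// Floor via truncation toward zero plus a correction for negative
// non-integers. Precondition: fits_in_int32(x).
inline float floor_via_int32(float x) {
  const float truncated = static_cast<float>(static_cast<std::int32_t>(x));
  return (truncated > x) ? truncated - 1.0f : truncated;
}

// Guarded version: use the integer trick only where it is valid.
inline float safe_floor(float x) {
  return fits_in_int32(x) ? floor_via_int32(x) : std::floor(x);
}
```

Without the range check, converting a value such as `3e10f` to `std::int32_t` is undefined behavior in C++, which is the scalar analogue of the large-value failure mentioned in the commit.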
*	Fix StlDeque for GCC 10	David Tellenbach	2020-07-29
\| \| \| \| \|	StlDeque extends std::deque by accessing some of its internal members. Since GCC 10 these are not accessible anymore.
*	Fix undefined BF16 union behavior in AVX512.	Teng Lu	2020-07-29
\|
*	Fix clang-tidy warnings in generic bfloat16 implementation	David Tellenbach	2020-07-27
\| \| \| \|	See !172 for related discussions.
*	Fix bfloat16 casts	David Tellenbach	2020-07-23
\| \| \| \| \| \| \|	If we have explicit conversion operators available (C++11) we define explicit casts from bfloat16 to other types. If not (C++03), we don't define conversion operators but rely on implicit conversion chains from bfloat16 over float to other types.
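A minimal sketch of the C++11 branch, using a hypothetical stand-in type `NarrowFloat` rather than `Eigen::bfloat16`: explicit conversion operators allow `static_cast` to other types while blocking implicit conversions. The traits checks assume C++11 `<type_traits>`.

```cpp
#include <type_traits>

// Hypothetical reduced-precision wrapper; it stores a float for simplicity.
struct NarrowFloat {
  float value;

  explicit NarrowFloat(float v) : value(v) {}

  // Explicit conversion operators (C++11): usable via static_cast only.
  explicit operator float() const { return value; }
  explicit operator int() const { return static_cast<int>(value); }
};

// Implicit conversion is rejected...
static_assert(!std::is_convertible<NarrowFloat, int>::value,
              "no implicit NarrowFloat -> int conversion");
// ...but direct initialization / static_cast is allowed.
static_assert(std::is_constructible<int, NarrowFloat>::value,
              "explicit NarrowFloat -> int conversion is available");
```

With this type, `static_cast<int>(NarrowFloat(2.75f))` compiles, while `int i = NarrowFloat(2.75f);` does not.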
*	Revert change that made conversion from bfloat16 to {float, double} implicit.	Rasmus Munk Larsen	2020-07-22
\| \| \| \|	Add roundtrip tests for casting between bfloat16 and complex types.
*	Fix cast of bfloat16 to std::complex<T>	David Tellenbach	2020-07-22
\| \| \| \|	This fixes https://gitlab.com/libeigen/eigen/-/issues/1951
*	Make sure we take the little-endian path if __BYTE_ORDER__ is not defined.	Rasmus Munk Larsen	2020-07-22
\|
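A byte-order check that defaults to little-endian when `__BYTE_ORDER__` is not defined could look like the sketch below. The macro `SKETCH_BIG_ENDIAN` and the runtime probe are hypothetical and only illustrate the idea; the actual Eigen macro is not shown in this log.

```cpp
#include <cstdint>
#include <cstring>

// Take the big-endian path only when the compiler positively reports it.
// If __BYTE_ORDER__ is undefined (e.g. on some compilers), fall through to
// the little-endian path.
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

// Runtime probe, useful for cross-checking the compile-time decision.
inline bool runtime_is_big_endian() {
  const std::uint32_t probe = 1;
  unsigned char first_byte;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 0;  // lowest address holds the most significant byte
}
```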
*	Faster conversion from integer types to bfloat16	Niels Dekker	2020-07-22
\| \| \| \| \| \|	Specialized `bfloat16_impl::float_to_bfloat16_rtne(float)` for normal floating point numbers, infinity and zero, in order to improve the performance of `bfloat16::bfloat16(const T&)` for integer argument types. A reduction of more than 20% of the runtime duration of conversion from int to bfloat16 was observed, using Visual C++ 2019 on Windows 10.
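For readers unfamiliar with the rounding step, here is a sketch of round-to-nearest-even conversion from `float` to bfloat16 bits for inputs that are not NaN (normal numbers, zero and infinity). It is an illustration under the assumption of IEEE 754 binary32 `float`, not a copy of `bfloat16_impl::float_to_bfloat16_rtne`; the helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret raw bits as a float (well-defined via memcpy).
inline float float_from_bits(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Round-to-nearest-even conversion to the upper 16 bits.
// Assumption: f is not NaN (a NaN payload could be rounded to infinity).
inline std::uint16_t float_to_bf16_bits_rtne(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  const std::uint32_t lsb = (u >> 16) & 1u;  // last bit that will be kept
  u += 0x7FFFu + lsb;                        // ties round toward even
  return static_cast<std::uint16_t>(u >> 16);
}
```

An exact tie such as the bit pattern `0x3F808000` rounds down to `0x3F80` (even), while `0x3F818000` rounds up to `0x3F82`.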
*	Avoid division by zero in nonZerosEstimate() for empty blocks.	Rasmus Munk Larsen	2020-07-22
\|
*	Make numext::as_uint a device function.	Rasmus Munk Larsen	2020-07-22
\|
*	user-defined copy operations removed in favor of compiler-generated ones	Alexander Turkin	2020-07-20
\|
*	Avoid undefined behavior by union type punning in float_to_bfloat16_rtne	Niels Dekker	2020-07-14
\| \| \| \| \| \| \| \|	Use `numext::as_uint`, instead of union based type punning, to avoid undefined behavior. See also C++ Core Guidelines: "Don't use a union for type punning" https://github.com/isocpp/CppCoreGuidelines/blob/v0.8/CppCoreGuidelines.md#c183-dont-use-a-union-for-type-punning `numext::as_uint` was suggested by David Tellenbach
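One well-defined alternative to union-based punning is `std::memcpy`, sketched below. Whether `numext::as_uint` is implemented this way is not stated in the commit message; the function name `bits_of` is hypothetical, and the expected bit patterns assume IEEE 754 binary32.

```cpp
#include <cstdint>
#include <cstring>

// Returns the object representation of a float as a 32-bit unsigned integer.
// Reading an inactive union member is undefined behavior in C++, whereas
// copying the bytes with memcpy is well-defined.
inline std::uint32_t bits_of(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch assumes a 32-bit float");
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  return u;
}
```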
*	AVX path for BF16	Sheng Yang	2020-07-14
\|
*	Allow implicit conversion from bfloat16 to float and double	Niels Dekker	2020-07-11
\| \| \| \| \| \|	Conversion from `bfloat16` to `float` and `double` is lossless. It seems natural to allow the conversion to be implicit, as the C++ language also supports implicit conversion from a smaller to a larger floating point type. Intel's oneDNN bfloat16 implementation also has an implicit `operator float()`: https://github.com/oneapi-src/oneDNN/blob/v1.5/src/common/bfloat16.hpp
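The losslessness claim follows from the bit layout: a bfloat16 value is the upper 16 bits of an IEEE 754 binary32 value, so widening only appends zero bits. A sketch under that assumption, with a hypothetical function name:

```cpp
#include <cstdint>
#include <cstring>

// Widen raw bfloat16 bits to float by placing them in the upper half of a
// 32-bit word. No rounding occurs, so every bfloat16 value maps to exactly
// one float (and from there exactly to double).
inline float bf16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &widened, sizeof(f));
  return f;
}
```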
*	Guard operator<< by EIGEN_NO_IO.	Rasmus Munk Larsen	2020-07-09
\|
*	Add operator<< to print a quaternion.	Rasmus Munk Larsen	2020-07-09
\|
*	Fix test basic stuff	David Tellenbach	2020-07-09
\| \| \| \| \| \|	- Guard fundamental types that are not available pre C++11
	- Separate subsequent angle brackets >> by spaces
	- Allow casting of Eigen::half and Eigen::bfloat16 to complex types
*	Add operator==/operator!= to Quaternion. Fixes #1876.	Forrest Voight	2020-07-07
\|
*	Change the sign operator in Eigen to return NaN for NaN arguments, not zero.	Rasmus Munk Larsen	2020-07-07
\|
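The changed behavior can be sketched for scalar floating-point types as follows. The function name is hypothetical and this is not Eigen's `sign` functor, only an illustration of propagating NaN instead of returning zero.

```cpp
#include <cmath>

// Returns -1, 0 or +1 for ordinary inputs and passes NaN through unchanged.
// Without the NaN check, both comparisons below would be false for NaN and
// the result would silently be 0.
template <typename T>
T sign_propagating_nan(T x) {
  if (std::isnan(x)) return x;
  return static_cast<T>((T(0) < x) - (x < T(0)));
}
```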
*	BF16 for scalar_cmp_with_cast_op	Sheng Yang	2020-07-01
\|
*	Fix tensor casts for large packets and casts to/from std::complex	Antonio Sanchez	2020-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \|	The original tensor casts were only defined for `SrcCoeffRatio`:`TgtCoeffRatio` 1:1, 1:2, 2:1, 4:1. Here we add the missing 1:N and 8:1. We also add casting `Eigen::half` to/from `std::complex<T>`, which was missing, to make it consistent with `Eigen::bfloat16`, and generalize the overload to work for any complex type. Tests were added to `basicstuff`, `packetmath`, and `cxx11_tensor_casts` to test all cast configurations.
*	Fix packetmath_1 float tests for arm/aarch64.	Antonio Sanchez	2020-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added missing `pmadd<Packet2f>` for NEON. This leads to a significant improvement in precision over the previous `pmul+padd`, which was causing the `pcos` tests to fail. Also added an approx test with `std::sin`/`std::cos` since otherwise returning any `a^2+b^2=1` would pass. Modified `log(denorm)` tests. Denorms are not supported by all systems (`::min` is returned instead), are always flushed to zero on 32-bit arm, and configurably flush to zero on sse/avx/aarch64. This leads to inconsistent results across different systems (e.g. `-inf` vs `nan`). Added a check for existence and excluded ARM. Removed logistic exactness test, since scalar and vectorized versions follow different code paths due to differences in `pexp` and `pmadd`, which result in slightly different values. For example, exactness always fails on arm, aarch64, and altivec.
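Why a fused multiply-add can be more precise than a separate multiply and add is easiest to see in scalar code. The sketch below uses `std::fma` as a stand-in for a fused `pmadd`; it is not the NEON implementation, and the function names are hypothetical. The `volatile` temporary only forces the product to be rounded to `float` before the addition, so that the compiler cannot contract the two operations.

```cpp
#include <cmath>

// One rounding: a * b + c is computed exactly, then rounded once.
inline float madd_fused(float a, float b, float c) {
  return std::fma(a, b, c);
}

// Two roundings: the product is rounded to float before the addition.
inline float madd_unfused(float a, float b, float c) {
  volatile float product = a * b;  // forces the intermediate rounding
  return product + c;
}
```

For `a = 1 + 2^-13`, `b = 1 - 2^-13` and `c = -1`, the exact product is `1 - 2^-26`, which rounds to `1.0f` as a `float`. The unfused version therefore returns `0`, while the fused version returns the exact result `-2^-26`.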
*	Add missing Packet2l/Packet2ul ops for NEON.	Antonio Sanchez	2020-06-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current multiply (`pmul`) and comparison operators (`pcmp_lt`, `pcmp_le`, `pcmp_eq`) are missing for packets `Packet2l` and `Packet2ul`. This leads to compile errors for the `packetmath.cpp` tests in clang. Here we add and test the missing ops.
	Tested:
	```
	$ aarch64-linux-gnu-g++ -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
	$ adb push packetmath /data/local/tmp/
	$ adb shell "/data/local/tmp/packetmath"
	$ arm-linux-gnueabihf-g++ -mfpu=neon -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
	$ adb push packetmath /data/local/tmp/
	$ adb shell "/data/local/tmp/packetmath"
	$ clang++ -target aarch64-linux-android21 -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
	$ adb push packetmath /data/local/tmp/
	$ adb shell "/data/local/tmp/packetmath"
	$ clang++ -target armv7-linux-android21 -static -mfpu=neon -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
	$ adb push packetmath /data/local/tmp/
	$ adb shell "/data/local/tmp/packetmath"
	```
*	Added missing NEON pcasts, update packetmath tests.	Antonio Sanchez	2020-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NEON `pcast` operators are all implemented and tested for existing packets. This requires adding a `pcast(a,b,c,d,e,f,g,h)` for casting between `int64_t` and `int8_t` in `GenericPacketMath.h`. Removed incorrect `HasHalfPacket` definition for NEON's `Packet2l`/`Packet2ul`. Adjustments were also made to the `packetmath` tests. These include
	- minor bug fixes for cast tests (i.e. 4:1 casts, only casting for packets that are vectorizable)
	- added 8:1 cast tests
	- random number generation
	  - original had uninteresting 0 to 0 casts for many casts between floating-point and integers, and exhibited signed overflow undefined behavior
	Tested:
	```
	$ aarch64-linux-gnu-g++ -static -I./ '-DEIGEN_TEST_PART_ALL=1' test/packetmath.cpp -o packetmath
	$ adb push packetmath /data/local/tmp/
	$ adb shell "/data/local/tmp/packetmath"
	```
*	Support BFloat16 in Eigen	Teng Lu	2020-06-20
\|
*	Fix pscatter and pgather for Altivec Complex double	Pedro Caldeira	2020-06-16
\|
*	Fix unused variable warning on Arm	David Tellenbach	2020-06-15
\|
*	Fix #1818: SparseLU: add methods nnzL() and nnzU()	Sebastien Boisvert	2020-06-11
\| \| \| \| \| \|	Now this compiles without errors: $ clang++ -I ../../ test_sparseLU.cpp -std=c++03
*	Fix #1911: add benchmark for move semantics with fixed-size matrix	Sebastien Boisvert	2020-06-11
\| \| \| \| \| \| \| \| \| \| \|	$ clang++ -O3 bench/bench_move_semantics.cpp -I. -std=c++11 \
	      -o bench_move_semantics
	$ ./bench_move_semantics
	float copy semantics: 1755.97 ms
	float move semantics: 55.063 ms
	double copy semantics: 2457.65 ms
	double move semantics: 55.034 ms
*	Remove HasCast and fix packetmath cast tests.	Antonio Sanchez	2020-06-11
\| \| \| \| \| \| \| \| \| \| \|	The use of the `packet_traits<>::HasCast` field is currently inconsistent with `type_casting_traits<>`, and is unused apart from within `test/packetmath.cpp`. In addition, those packetmath cast tests do not currently reflect how casts are performed in practice: they ignore the `SrcCoeffRatio` and `TgtCoeffRatio` fields, assuming a 1:1 ratio. Here we remove the unused `HasCast`, and modify the packet cast tests to better reflect their usage.
*	Implement scalar_cmp_with_cast_op	ShengYang1	2020-06-09
\|
*	Fix static analyzer warning in SelfadjointProduct.h.	Rasmus Munk Larsen	2020-06-08
\| \| \| \|	Fix compiler warnings in GeneralBlockPanelKernel.h.
*	Update FindComputeCpp.cmake to fix build problems on Windows	Thales Sabino	2020-06-05
\| \| \| \| \|	- Use standard types in SYCL/PacketMath.h to avoid compilation problems on Windows
	- Add EIGEN_HAS_CONSTEXPR to cxx11_tensor_argmax_sycl.cpp to fix build problems on Windows
*	Fix broken packetmath test for logistic on Arm.	Rasmus Munk Larsen	2020-06-04
\|
*	Fix typo in previous update to generic predux_any.	Rasmus Munk Larsen	2020-06-04
\|
*	Avoid implicit float equality comparison in generic predux_any, but use ↵	Rasmus Munk Larsen	2020-06-04
\| \| \| \|	numext::not_equal_strict to avoid breaking builds that compile with -Werror=float-equal.
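A sketch of how a strict inequality helper can sidestep `-Wfloat-equal`: the comparison is delegated to `std::not_equal_to`, so the `!=` on floating-point values happens inside a standard-library header, where compilers normally suppress warnings. Whether `numext::not_equal_strict` works exactly this way is an assumption here; the names `not_equal_strict_sketch` and `any_nonzero` are hypothetical.

```cpp
#include <cstddef>
#include <functional>

// Exact (bitwise-value) inequality without writing `x != y` on floats
// directly in user code.
template <typename T>
bool not_equal_strict_sketch(const T& x, const T& y) {
  return std::not_equal_to<T>()(x, y);
}

// Scalar analogue of a "predux_any"-style reduction: true if any element
// differs from zero.
inline bool any_nonzero(const float* data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    if (not_equal_strict_sketch(data[i], 0.0f)) return true;
  }
  return false;
}
```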
*	Fix compilation error in logistic packet op.	Rasmus Munk Larsen	2020-06-03
\|
*	Bug #1777: make the scalar and packet path consistent for the logistic ↵	Gael Guennebaud	2020-05-31
\| \| \| \|	function + respective unit test
*	Fix #1833: compilation issue of "array!=scalar" with c++20	Gael Guennebaud	2020-05-30
\|
*	Save one extra temporary when assigning a sparse product to a row-major ↵	Gael Guennebaud	2020-05-30
\| \| \| \|	sparse matrix
*	Add support for PacketBlock<Packet8s,4> and PacketBlock<Packet16uc,4> ↵	Kan Chen	2020-05-29
\| \| \| \|	ptranspose on NEON
*	Fix #1874: it works on both MSVC 2017 and other platforms.	Kan Chen	2020-05-21
\|
*	Add pscatter for Packet16{u}c (int8)	Pedro Caldeira	2020-05-20
\|
*	- Vectorizing MMA packing.	Everton Constantino	2020-05-19
\| \| \| \| \|	- Optimizing MMA kernel.
	- Adding PacketBlock store to blas_data_mapper.