eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Define EIGEN_CPLUSPLUS and replace most __cplusplus checks.	Antonio Sanchez	2021-03-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The macro `__cplusplus` is not defined correctly in MSVC unless building with the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the specified C++ standard version number. Here we introduce `EIGEN_CPLUSPLUS`, which will contain the C++ version number both for MSVC and otherwise. This simplifies checks for supported features. Also replaced most instances of standard version checking via `__cplusplus` with the existing `EIGEN_COMP_CXXVER` macro for better clarity. Fixes: #2170
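The version check described in the message above can be sketched as follows. This is a minimal illustration, not Eigen's actual definition: the names `SKETCH_CPLUSPLUS`, `SKETCH_HAS_CXX11`, `sketch_cplusplus_version` and `sketch_self_check` are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Hypothetical macro; Eigen's real EIGEN_CPLUSPLUS may be defined differently.
#if defined(_MSVC_LANG)
  // MSVC reports the selected standard here even when __cplusplus stays at 199711L.
  #define SKETCH_CPLUSPLUS _MSVC_LANG
#elif defined(__cplusplus)
  #define SKETCH_CPLUSPLUS __cplusplus
#else
  #define SKETCH_CPLUSPLUS 0
#endif

// Feature checks can then test one macro on every compiler.
#if SKETCH_CPLUSPLUS >= 201103L
  #define SKETCH_HAS_CXX11 1
#else
  #define SKETCH_HAS_CXX11 0
#endif

// Expose the detected value so it can be inspected at run time.
inline long sketch_cplusplus_version() { return static_cast<long>(SKETCH_CPLUSPLUS); }

// Any conforming C++ compiler reports at least the C++98 value.
inline void sketch_self_check() { assert(sketch_cplusplus_version() >= 199711L); }
```

The design point is that the compiler-specific branch lives in one place, so the rest of the code never needs to mention `_MSVC_LANG`.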
*	Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), ↵	David Tellenbach	2021-03-05
\| \| \| \| \| \| \|	innerStride(), outerStride(), and size()" This reverts commit 6cbb3038ac48cb5fe17eba4dfbf26e3e798041f1 because it breaks clang-10 builds on x86 and aarch64 when C++11 is enabled.
*	Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), ↵	Steve Bronder	2021-03-04
\| \| \| \|	outerStride(), and size()
*	Add log2 operation to TensorBase	Eugene Zhulenev	2021-03-04
\|
*	Inherit from `no_assignment_operator` to avoid implicit copy constructor ↵	Christoph Hertzberg	2021-02-27
\| \| \| \| \| \|	warnings (cherry picked from commit 9bbb7ea4b54b1f307863be4ed8d105c38cdefe50)
*	Fix some enum-enum conversion warnings	Christoph Hertzberg	2021-02-27
\| \| \| \|	(cherry picked from commit 838f3d8ce22a5549ef10c7386fb03040721749a0)
*	ReturnByValue is already non-copyable	Christoph Hertzberg	2021-02-27
\| \| \| \|	(cherry picked from commit abbf95045009619f37bd92b45433eedbfcbe41cf)
*	Fix double-promotion warnings	Christoph Hertzberg	2021-02-27
\| \| \| \|	(cherry picked from commit c22c103e932e511e96645186831363585a44b7a3)
*	Idrs iterative linear solver	Jens Wehner	2021-02-27
\|
*	Don't crash when attempting to slice an empty tensor.	Rasmus Munk Larsen	2021-02-24
\|
*	Some improvements for kissfft from Martin Reinecke (pocketfft author):	Guoqiang QI	2021-02-24
\| \| \| \| \| \|	1. Compute only about half of the factors and use complex conjugate symmetry for the rest, instead of computing all of them, to save time. 2. Calculate all twiddles in double, because that gives the maximum achievable precision when doing float transforms. 3. Reduce all angles to the range 0<angle<pi/4, which gives even more precision.
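The three ideas in the message above can be sketched together as follows. This is an illustration under stated assumptions, not the code from kissfft, pocketfft or Eigen: the function names are hypothetical, and the sign convention (twiddle k is exp(+2*pi*i*k/n)) is chosen only for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: exp(2*pi*i*k/n) for 0 <= k <= n/2, computed in double.
// sin/cos are only ever evaluated on arguments in [0, pi/4].
inline std::complex<double> upper_half_twiddle(std::size_t k, std::size_t n) {
  assert(n > 0 && k <= n / 2);
  const double pi = 3.141592653589793238462643383279502884;
  const double a = 0.25 * pi / static_cast<double>(n);  // one eighth of the angular step
  std::size_t x = 8 * k;                                 // angle = a * x, with x <= 4n
  if (x < 2 * n) {                                       // first quadrant
    if (x < n) return {std::cos(a * double(x)), std::sin(a * double(x))};
    const double phi = a * double(2 * n - x);            // mirror about pi/4
    return {std::sin(phi), std::cos(phi)};
  }
  x -= 2 * n;                                            // second quadrant
  if (x < n) return {-std::sin(a * double(x)), std::cos(a * double(x))};
  const double phi = a * double(2 * n - x);
  return {-std::cos(phi), std::sin(phi)};
}

// Hypothetical table builder: only k <= n/2 is computed directly; the
// remaining entries follow from conjugate symmetry, w[n-k] = conj(w[k]).
inline std::vector<std::complex<float>> make_twiddles(std::size_t n) {
  std::vector<std::complex<float>> tw(n);
  if (n == 0) return tw;
  for (std::size_t k = 0; k <= n / 2; ++k) {
    const std::complex<double> w = upper_half_twiddle(k, n);
    // Round to float only once, after the double-precision evaluation.
    tw[k] = std::complex<float>(static_cast<float>(w.real()),
                                static_cast<float>(w.imag()));
    if (k != 0 && k != n - k) tw[n - k] = std::conj(tw[k]);
  }
  return tw;
}
```

For example, `make_twiddles(8)` evaluates trigonometric functions for only five of the eight entries and fills the other three by conjugation.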
*	Eliminate CMake FindPackageHandleStandardArgs warnings.	Antonio Sanchez	2021-02-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CMake complains that the package name does not match when the case differs, e.g.: ``` CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message): The package name passed to `find_package_handle_standard_args` (UMFPACK) does not match the name of the calling package (Umfpack). This can lead to problems in calling code that expects `find_package` result variables (e.g., `_FOUND`) to follow a certain pattern. Call Stack (most recent call first): cmake/FindUmfpack.cmake:50 (find_package_handle_standard_args) bench/spbench/CMakeLists.txt:24 (find_package) This warning is for project developers. Use -Wno-dev to suppress it. ``` Here we rename the libraries to match their true cases.
*	Add missing adolc isinf/isnan.	Antonio Sanchez	2021-02-19
\| \| \| \| \| \| \|	Also modified cmake/FindAdolc.cmake to eliminate warnings, and added search paths to match install layout. Fixed: #2157
*	Return nan at poles of polygamma, digamma, and zeta if limit is not defined	frgossen	2021-02-19
\|
*	Remove vim specific comments to recognize correct file-type.	David Tellenbach	2021-02-09
\| \| \| \|	As discussed in #2143 we remove editor specific comments.
*	add specialization of check_sparse_solving() for SuperLU solver, in order to ↵	Ralf Hannemann-Tamas	2021-02-08
\| \| \| \|	test adjoint and transpose solves
*	Include `<cstdint>` in one place, remove custom typedefs	Antonio Sanchez	2021-01-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originating from [this SO issue](https://stackoverflow.com/questions/65901014/how-to-solve-this-all-error-2-in-this-case), some win32 compilers define `__int32` as a `long`, but MinGW defines `std::int32_t` as an `int`, leading to a type conflict. To avoid this, we remove the custom `typedef` definitions for win32. The Tensor module requires C++11 anyways, so we are guaranteed to have included `<cstdint>` already in `Eigen/Core`. Also re-arranged the headers to only include `<cstdint>` in one place to avoid this type of error again.
*	fix test of ExtractVolumePatchesOp	Gmc2	2021-01-25
\|
*	Remove std::cerr in iterative solver since we don't have iostream.	David Tellenbach	2021-01-21
\| \| \| \|	This fixes #2123
*	fix paddings of TensorVolumePatchOp	Maozhou, Ge	2021-01-15
\|
*	Add CUDA complex sqrt.	Antonio Sanchez	2020-12-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is to support scalar `sqrt` of complex numbers `std::complex<T>` on device, requested by Tensorflow folks. Technically `std::complex` is not supported by NVCC on device (though it is by clang), so the default `sqrt(std::complex<T>)` function only works on the host. Here we create an overload to add back the functionality. Also modified the CMake file to add `--relaxed-constexpr` (or equivalent) flag for NVCC to allow calling constexpr functions from device functions, and added support for specifying compute architecture for NVCC (was already available for clang).
*	Replace call to FixedDimensions() with a singleton instance of	Turing Eret	2020-12-16
\| \| \| \|	FixedDimensions.
*	TensorStorage with FixedDimensions now has zero instance memory overhead.	Turing Eret	2020-12-14
\| \| \| \| \| \| \|	Removed m_dimension as instance member of TensorStorage with FixedDimensions and instead use the template parameter. This means that the sizeof a pure fixed-size storage is exactly equal to the data it is storing.
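The effect described above can be illustrated with a small sketch. The types `FixedDims`, `FixedStorage` and `StorageWithMember` are hypothetical stand-ins invented for this example, not Eigen's `Sizes` or `TensorStorage`; the sketch assumes non-zero dimensions.

```cpp
#include <cassert>
#include <cstddef>

// Compile-time product of a dimension list (C++11-compatible recursion).
template <std::size_t... Dims> struct Product;
template <> struct Product<> { static constexpr std::size_t value = 1; };
template <std::size_t First, std::size_t... Rest>
struct Product<First, Rest...> {
  static constexpr std::size_t value = First * Product<Rest...>::value;
};

// Hypothetical compile-time dimension list; it carries no run-time state.
template <std::size_t... Dims>
struct FixedDims {
  static constexpr std::size_t rank = sizeof...(Dims);
  static constexpr std::size_t total = Product<Dims...>::value;
};

// Storage that recovers its dimensions from the template parameter only.
template <typename T, typename Dims>
class FixedStorage {
 public:
  T* data() { return m_data; }
  static constexpr std::size_t size() { return Dims::total; }
 private:
  T m_data[Dims::total];
};

// For contrast: keeping even an empty dimensions object as a member costs space.
template <typename T, typename Dims>
struct StorageWithMember {
  T m_data[Dims::total];
  Dims m_dimensions;
};

using Storage2x3 = FixedStorage<float, FixedDims<2, 3>>;
using Member2x3 = StorageWithMember<float, FixedDims<2, 3>>;

static_assert(sizeof(Storage2x3) == 6 * sizeof(float), "no per-instance overhead");
static_assert(sizeof(Member2x3) > 6 * sizeof(float), "an empty member still takes space");

// Small run-time check that the storage is usable.
inline void storage_self_check() {
  Storage2x3 s;
  s.data()[5] = 1.0f;
  assert(Storage2x3::size() == 6);
  assert(s.data()[5] == 1.0f);
}
```

The second `static_assert` shows why dropping the member matters: an empty class used as an ordinary data member still occupies at least one byte, which padding usually rounds up further.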
*	Remove code checking for CMake < 3.5	Alexander Grund	2020-12-14
\| \| \| \|	As the CMake version is at least 3.5 the code checking for earlier versions can be removed.
*	Fix bad NEON fp16 check	Antonio Sanchez	2020-12-04
\|
*	Special function implementations for half/bfloat16 packets.	Antonio Sanchez	2020-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Current implementations fail to consider half-float packets, only half-float scalars. Added specializations for packets on AVX, AVX512 and NEON. Added tests to `special_packetmath`. The current `special_functions` tests would fail for half and bfloat16 due to lack of precision. The NEON tests also fail with precision issues and due to different handling of `sqrt(inf)`, so special functions bessel, ndtri have been disabled. Tested with AVX, AVX512.
*	Clean up the Tensor header and get rid of the EIGEN_SLEEP macro.	Rasmus Munk Larsen	2020-12-02
\|
*	Make inclusion of doc sub-directory optional by adjusting options.	Bowie Owens	2020-11-27
\| \| \| \| \| \| \| \| \| \|	Allows exclusion of doc and related targets to help when using eigen via add_subdirectory(). Requested by: https://gitlab.com/libeigen/eigen/-/issues/1842 Also required making EIGEN_TEST_BUILD_DOCUMENTATION a dependent option on EIGEN_BUILD_DOC. This ensures documentation targets are properly defined when EIGEN_TEST_BUILD_DOCUMENTATION is ON.
*	Fix boolean float conversion and product warnings.	Antonio Sanchez	2020-11-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes some gcc warnings such as: ``` Eigen/src/Core/GenericPacketMath.h:655:63: warning: implicit conversion turns floating-point number into bool: 'typename __gnu_cxx::__enable_if<__is_integer<bool>::__value, double>::__type' (aka 'double') to 'bool' [-Wimplicit-conversion-floating-point-to-bool] Packet psqrt(const Packet& a) { EIGEN_USING_STD(sqrt); return sqrt(a); } ``` Details: - Added `scalar_sqrt_op<bool>` (`-Wimplicit-conversion-floating-point-to-bool`). - Added `scalar_square_op<bool>` and `scalar_cube_op<bool>` specializations (`-Wint-in-bool-context`) - Deprecated above specialized ops for bool. - Modified `cxx11_tensor_block_eval` to specialize generator for booleans (`-Wint-in-bool-context`) and to use `abs` instead of `square` to avoid deprecated bool ops.
*	Fix sparse_extra_3, disable counting temporaries for testing ↵	Antonio Sanchez	2020-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DynamicSparseMatrix. Multiplication of column-major `DynamicSparseMatrix`es involves three temporaries: - two for transposing twice to sort the coefficients (`ConservativeSparseSparseProduct.h`, L160-161) - one for a final copy assignment (`SparseAssign.h`, L108) The latter is avoided in an optimization for `SparseMatrix`. Since `DynamicSparseMatrix` is deprecated in favor of `SparseMatrix`, it's not worth the effort to optimize further, so I simply disabled counting temporaries via a macro. Note that due to the inclusion of `sparse_product.cpp`, the `sparse_extra` tests actually re-run all the original `sparse_product` tests as well. We may want to simply drop the `DynamicSparseMatrix` tests altogether, which would eliminate the test duplication. Related to #2048
*	Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom	Antonio Sanchez	2020-11-18
\| \| \| \| \| \| \| \| \| \|	The existing `TensorRandom.h` implementation makes the assumption that `half` (`bfloat16`) has a `uint16_t` member `x` (`value`), which is not always true. This currently fails on arm64, where `x` has type `__fp16`. Added `bit_cast` specializations to allow casting to/from `uint16_t` for both `half` and `bfloat16`. Also added tests in `half_float`, `bfloat16_float`, and `cxx11_tensor_random` to catch these errors in the future.
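A bit cast that does not depend on the name or type of the wrapped member can be sketched with `std::memcpy`. This is a minimal sketch, not Eigen's `bit_cast` implementation: `sketch_bit_cast`, `OpaqueHalf`, `half_to_bits` and `bits_to_half` are hypothetical names, and `OpaqueHalf` merely stands in for a 16-bit type whose internal storage type is platform-specific.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copy the object representation of `src` into a value of type `To`.
template <typename To, typename From>
inline To sketch_bit_cast(const From& src) {
  static_assert(sizeof(To) == sizeof(From), "bit cast requires equal sizes");
  static_assert(std::is_trivially_copyable<To>::value &&
                    std::is_trivially_copyable<From>::value,
                "bit cast requires trivially copyable types");
  To dst;
  std::memcpy(&dst, &src, sizeof(To));
  return dst;
}

// Hypothetical 16-bit type; callers must not assume anything about `storage`.
struct OpaqueHalf {
  unsigned char storage[2];
};

inline std::uint16_t half_to_bits(const OpaqueHalf& h) {
  return sketch_bit_cast<std::uint16_t>(h);
}

inline OpaqueHalf bits_to_half(std::uint16_t bits) {
  return sketch_bit_cast<OpaqueHalf>(bits);
}

// Round trip: the bit pattern survives conversion in both directions.
inline void bit_cast_self_check() {
  assert(half_to_bits(bits_to_half(0x3C00)) == 0x3C00);
}
```

Code such as a random number generator can then build a bit pattern as `uint16_t` and convert it, without reaching into a member that may be `uint16_t` on one platform and `__fp16` on another.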
*	Fix rule-of-3 for the Tensor module.	Antonio Sanchez	2020-11-18
\| \| \| \| \| \| \|	Adds copy constructors to Tensor ops, inherits assignment operators from `TensorBase`. Addresses #1863
*	Disable testing of OpenGL by default.	Antonio Sanchez	2020-11-12
\| \| \| \| \| \| \| \| \| \| \| \|	The `OpenGLSupport` module contains mostly deprecated features, and the test is highly GL context-dependent, relies on deprecated GLUT, and requires a display. Until the module is updated to support modern OpenGL and the test to use newer windowing frameworks (e.g. GLFW) it's probably best to disable the test by default. The test can be enabled with `cmake -DEIGEN_TEST_OPENGL=ON`. See #2053 for more details.
*	Address issues with `openglsupport` test.	Antonio Sanchez	2020-11-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing test fails on several systems due to GL runtime version mismatches, the use of deprecated features, and memory errors due to improper use of GLUT. The test was modified to: - Run within a display function, allowing proper GLUT cleanup. - Generate dynamic shaders with a supported GLSL version string and output variables. - Report shader compilation errors. - Check GL context version before launching version-specific tests. Note that most of the existing `OpenGLSupport` module and tests rely on deprecated features (e.g. fixed-function pipeline). The test was modified to allow it to pass on various systems. We might want to consider removing the module or re-writing it entirely to support modern OpenGL. This is beyond the scope of this patch. Testing of legacy GL (for platforms that support it) can be enabled by defining `EIGEN_LEGACY_OPENGL`. Otherwise, the test will try to create a modern context. Tested on - MacBook Air (2019), macOS Catalina 10.15.7 (OpenGL 2.1, 4.1) - Debian 10.6, NVidia Quadro K1200 (OpenGL 3.1, 3.3)
*	CMakefile update for ROCm 4.0	Deven Desai	2020-10-29
\| \| \| \|	Starting with ROCm 4.0, the `hipconfig --platform` command will return `amd` (prior return value was `hcc`). Updating the CMakeLists.txt files in the test dirs to account for this change.
*	[SYCL clean up the code] : removing extra #pragma unroll in SYCL which was ↵	mehdi-goli	2020-10-28
\| \| \| \|	causing issues in embedded systems
*	Remove leftover debug print statement in cxx11_tensor_expr.cpp	Rasmus Munk Larsen	2020-10-14
\|
*	Get rid of nested template specialization in TensorReductionGpu.h, which was ↵	Rasmus Munk Larsen	2020-10-13
\| \| \| \|	broken by c6953f799b01d36f4236b64f351cc1446e0abe17.
*	Add packet generic ops `predux_fmin`, `predux_fmin_nan`, `predux_fmax`, and ↵	Rasmus Munk Larsen	2020-10-13
\| \| \| \|	`predux_fmax_nan` that implement reductions with `PropagateNaN`, and `PropagateNumbers` semantics. Add (slow) generic implementations for most reductions.
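The two NaN policies named above can be illustrated on plain scalars. This is a sketch of the semantics only, not Eigen's packet implementation: `min_propagate_nan`, `min_propagate_numbers` and `reduce` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// "PropagateNaN" semantics: if either input is NaN, the result is NaN.
inline float min_propagate_nan(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return b < a ? b : a;
}

// "PropagateNumbers" semantics: a NaN input is ignored when the other is a number.
inline float min_propagate_numbers(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return b < a ? b : a;
}

// Slow generic reduction: fold the binary operation over all elements.
template <typename Op>
inline float reduce(const float* values, std::size_t n, Op op) {
  assert(n >= 1);
  float acc = values[0];
  for (std::size_t i = 1; i < n; ++i) acc = op(acc, values[i]);
  return acc;
}
```

With the input `{3, NaN, 1}`, reducing with `min_propagate_nan` yields NaN, while `min_propagate_numbers` yields 1. If every element is NaN, both policies return NaN.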
*	Add EIGEN prefix for HAS_LGAMMA_R	David Tellenbach	2020-10-08
\|
*	Use lgamma_r if it is available (update check for glibc 2.19+)	Eugene Zhulenev	2020-10-08
\|
*	Don't make assumptions about NaN-propagation for pmin/pmax - it varies ↵	Rasmus Munk Larsen	2020-10-07
\| \| \| \| \| \|	across platforms. Change test to only test for NaN-propagation for pfmin/pfmax.
*	Fix Eigen::ThreadPool::CurrentThreadId returning wrong thread id when ↵	Zhuyie	2020-09-25
\| \| \| \|	EIGEN_AVOID_THREAD_LOCAL and NDEBUG are defined
*	Get rid of initialization logic for blueNorm by making the computed ↵	Rasmus Munk Larsen	2020-09-18
\| \| \| \| \| \|	constants static const or constexpr. Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.
*	Fixing a CUDA / P100 regression introduced by PR 181	Deven Desai	2020-08-20
\| \| \| \| \| \|	PR 181 ( https://gitlab.com/libeigen/eigen/-/merge_requests/181 ) adds a `__launch_bounds__(1024)` attribute to GPU kernels that did not have that attribute explicitly specified. That PR seems to cause regressions on the CUDA platform. This commit restricts the changes made in PR 181 to HIP only.
*	Disable min/max NaN propagation in test cxx11_tensor_expr	David Tellenbach	2020-08-14
\| \| \| \| \| \| \|	The current pmin/pmax implementation for Arm Neon propagates NaNs differently than std::min/std::max. See issue https://gitlab.com/libeigen/eigen/-/issues/1937
*	Adding an explicit launch_bounds(1024) attribute for GPU kernels.	Deven Desai	2020-08-05
\| \| \| \| \| \| \| \| \| \|	Starting with ROCm 3.5, the HIP compiler will change from HCC to hip-clang. This compiler change introduces a change in the default value of the `__launch_bounds__` attribute associated with a GPU kernel (the default being the value the compiler assumes when the user does not specify the attribute explicitly). Currently (i.e. for HIP with ROCm 3.3 and older), the default value is 1024. That changes to 256 with ROCm 3.5 (i.e. the hip-clang compiler). As a consequence, if a GPU kernel with a `__launch_bounds__` attribute of 256 is launched at runtime with a threads_per_block value > 256, it leads to a runtime error. This is causing a couple of Eigen unit test failures with ROCm 3.5. This commit adds an explicit `__launch_bounds__(1024)` attribute to every GPU kernel that currently does not have it explicitly specified (and hence would end up getting the default value of 256 with the change to hip-clang).
*	Inherit alignment trait from argument in TensorBroadcasting to avoid ↵	Rasmus Munk Larsen	2020-07-28
\| \| \| \|	segfault when the argument is unaligned.
*	Update tensor reduction test to avoid undefined division of bfloat16 by int.	Rasmus Munk Larsen	2020-07-22
\|
*	Fix tensor casts for large packets and casts to/from std::complex	Antonio Sanchez	2020-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \|	The original tensor casts were only defined for `SrcCoeffRatio`:`TgtCoeffRatio` 1:1, 1:2, 2:1, 4:1. Here we add the missing 1:N and 8:1. We also add casting `Eigen::half` to/from `std::complex<T>`, which was missing, to make it consistent with `Eigen::bfloat16`, and generalize the overload to work for any complex type. Tests were added to `basicstuff`, `packetmath`, and `cxx11_tensor_casts` to test all cast configurations.