eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Support manually disabling exceptions [HEAD, master]	Benjamin Barenblat	2021-07-07
\| \| \| \| \|	Rename EIGEN_EXCEPTIONS to EIGEN_USE_EXCEPTIONS, and allow disabling exceptions with -DEIGEN_USE_EXCEPTIONS=0.
*	Fix NVCC+ICC issues.	Antonio Sanchez	2021-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NVCC does not understand `__forceinline`, so we need to use `inline` when compiling for GPU. ICC specializes `std::complex` operators for `float` and `double` by default, which cannot be used on device and conflict with Eigen's workaround in CUDA/Complex.h. This can be prevented by defining `_OVERRIDE_COMPLEX_SPECIALIZATION_` before including `<complex>`. Added this define to the tests and to `Eigen/Core`, but this will not work if the user includes `<complex>` before `<Eigen/Core>`. ICC also seems to generate a duplicate `Map` symbol in `PlainObjectBase`: ``` error: "Map" has already been declared in the current scope static ConstMapType Map(const Scalar *data) ``` I tracked this down to `friend class Eigen::Map`. Putting the `friend` statements at the bottom of the class seems to resolve this issue. Fixes #2180
*	Fix excessive GEBP register spilling for 32-bit NEON.	Antonio Sanchez	2021-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clang does a poor job of optimizing the GEBP microkernel on 32-bit ARM, leading to excessive 16-byte register spills, slowing down basic f32 matrix multiplication by approx 50%. By specializing `gebp_traits`, we can eliminate the register spills. Volatile inline ASM both acts as a barrier to prevent reordering and enforces strict register use. In a simple f32 matrix multiply example, this modification reduces 16-byte spills from 109 instances to zero, leading to a 1.5x speed increase (search for `16-byte Spill` in the assembly in https://godbolt.org/z/chsPbE). This is a replacement of !379. See there for further discussion. Also moved `gebp_traits` specializations for NEON to `Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h` to be alongside other NEON-specific code. Fixes #2138.
*	Add support for Arm SVE	David Tellenbach	2021-01-21
\| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for Arm's new vector extension SVE (Scalable Vector Extension). In contrast to other vector extensions that are supported by Eigen, SVE types are inherently sizeless. For use in Eigen we fix their size at compile time (note that this is not necessary in general, SVE is length agnostic). During compilation the flag `-msve-vector-bits=N` has to be set, where `N` is a power of two in the range of `128` to `2048`, indicating the length of an SVE vector. Since SVE is rather young, we decided to disable it by default even if it would be available. A user has to enable it explicitly by defining `EIGEN_ARM64_USE_SVE`. This patch introduces the packet types `PacketXf` and `PacketXi` for packets of `float` and `int32_t` respectively. The size of these packets depends on the SVE vector length. E.g. if `-msve-vector-bits=512` is set, `PacketXf` will contain `512/32 = 16` elements. This MR is joint work with Miguel Tairum <miguel.tairum@arm.com>.
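The packet-size arithmetic described above can be sketched in a few lines of standalone C++. This is only an illustration: `sve_packet_size` and `is_valid_sve_vector_bits` are hypothetical helpers, not part of Eigen, and the sketch assumes 8-bit bytes and a 4-byte `float`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an Eigen API): number of elements of type T
// that fit in an SVE vector of `vector_bits` bits, assuming 8-bit bytes.
template <typename T>
constexpr std::size_t sve_packet_size(std::size_t vector_bits) {
  return vector_bits / (8 * sizeof(T));
}

// Hypothetical helper: the commit message requires N to be a power of two
// in the range 128..2048.
constexpr bool is_valid_sve_vector_bits(std::size_t n) {
  return n >= 128 && n <= 2048 && (n & (n - 1)) == 0;
}

// With -msve-vector-bits=512, a packet of float holds 512/32 = 16 elements.
static_assert(sve_packet_size<float>(512) == 16, "512-bit vector, 32-bit float");
static_assert(is_valid_sve_vector_bits(512), "512 is a power of two in range");
```

The same helper gives 4 elements for a 128-bit vector of `std::int32_t`, matching the smallest vector length the flag accepts.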
*	Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD	David Tellenbach	2020-10-09
\|
*	MatrixProduct enhancements:	Everton Constantino	2020-09-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	- Changes to Altivec/MatrixProduct: Adapting code to gcc 10. Generic code style and performance enhancements. Adding PanelMode support. Adding stride/offset support. Enabling float64, std::complex<float> and std::complex<double>. Fixing lack of symm_pack. Enabling mixedtypes. - Adding std::complex tests to blasutil. - Adding an implementation of storePacketBlock when Incr != 1.
*	Support BFloat16 in Eigen	Teng Lu	2020-06-20
\|
*	Fix #1757: remove the word 'suicide'	Sebastien Boisvert	2020-06-11
\|
*	Fix #556: warnings with mingw	Gael Guennebaud	2020-05-31
\|
*	Fix incorrect usage of `if defined(EIGEN_ARCH_PPC)` => `if EIGEN_ARCH_PPC`	Yong Tang	2020-05-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR tries to fix an incorrect usage of `if defined(EIGEN_ARCH_PPC)` in `Eigen/Core` header. In `Eigen/src/Core/util/Macros.h`, EIGEN_ARCH_PPC was explicitly defined as either 0 or 1. As a result `if defined(EIGEN_ARCH_PPC)` will always be true. This causes issues when building on non PPC platform and `MatrixProduct.h` is not available. This fix changes `if defined(EIGEN_ARCH_PPC)` => `if EIGEN_ARCH_PPC`. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
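The pitfall this commit fixes can be reproduced without Eigen. The sketch below uses a hypothetical macro `DEMO_ARCH_PPC` in place of `EIGEN_ARCH_PPC`; like the real macro, it is always defined, as either 0 or 1.

```cpp
// Hypothetical stand-in for EIGEN_ARCH_PPC on a non-PPC platform:
// the macro exists, but its value is 0.
#define DEMO_ARCH_PPC 0

// `#if defined(...)` only asks whether the name exists, so this branch
// is taken even though the value is 0.
#if defined(DEMO_ARCH_PPC)
constexpr bool kDefinedCheck = true;
#else
constexpr bool kDefinedCheck = false;
#endif

// `#if ...` evaluates the value, so this branch is skipped when the
// macro expands to 0.
#if DEMO_ARCH_PPC
constexpr bool kValueCheck = true;
#else
constexpr bool kValueCheck = false;
#endif

static_assert(kDefinedCheck, "defined() is true for a macro defined as 0");
static_assert(!kValueCheck, "value test is false for a macro defined as 0");
```

Testing the value rather than the existence is the appropriate check for any macro that is unconditionally defined as 0 or 1.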
*	- Vectorizing MMA packing.	Everton Constantino	2020-05-19
\| \| \| \| \|	- Optimizing MMA kernel. - Adding PacketBlock store to blas_data_mapper.
*	Eigen moved the `ScanLauncher` function inside the internal namespace.	mehdi-goli	2020-05-11
\| \| \| \| \| \| \|	This commit applies the following changes: - Moving the `ScanLauncher` specialization inside the internal namespace to fix a compiler crash on TensorScan for the SYCL backend. - Replacing `SYCL/sycl.hpp` with `CL/sycl.hpp` in order to follow the SYCL 1.2.1 standard. - Minor fixes: commenting out an unused variable to avoid compiler warnings.
*	Include <sstream> explicitly, and don't rely on the implicit include via ↵	Tobias Bosch	2020-02-24
\| \| \| \| \|	<complex>. This implicit dependency no longer exists in a recent LLVM release (sha 78be61871704).
*	Fix a circular dependency regarding pshift* functions and ↵	Gael Guennebaud	2019-09-06
\| \| \| \| \| \| \|	GenericPacketMathFunctions. Another solution would have been to make pshift* fully generic template functions with partial specialization, which is always a mess in C++03.
*	Fix compilation without vector engine available (e.g., x86 with SSE disabled):	Gael Guennebaud	2019-09-05
\| \| \| \|	-> ppolevl is required by ndtri even for the scalar path
*	Fix missing header inclusion and colliding definitions for half type ↵	Rasmus Munk Larsen	2019-08-30
\| \| \| \| \| \|	casting, which broke build with -march=native on Haswell/Skylake.
*	Clean up float16 a.k.a. Eigen::half support in Eigen. Move the definition of ↵	Rasmus Munk Larsen	2019-08-27
\| \| \| \|	half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories.
*	[SYCL] This PR adds the minimum modifications to Eigen core required to run ↵	Mehdi Goli	2019-06-27
\| \| \| \| \| \| \| \|	Eigen unsupported modules on devices supporting SYCL. * Adding SYCL memory model * Enabling/Disabling SYCL backend in Core * Supporting Vectorization
*	Implement AVX512 vectorization of std::complex<float/double>	Gael Guennebaud	2018-12-06
\|
*	temporarily re-disable SSE/AVX vectorization of complex<> on AVX512 -> this ↵	Gael Guennebaud	2018-12-06
\| \| \| \|	needs to be fixed though!
*	Fix pandnot order in AVX512	Gael Guennebaud	2018-11-30
\|
*	Add missing SSE/AVX type-casting in AVX512 mode	Gael Guennebaud	2018-11-28
\|
*	bug #1631: fix compilation with ARM NEON and clang, and cleanup the weird ↵	Gael Guennebaud	2018-11-27
\| \| \| \|	pshiftright_and_cast and pcast_and_shiftleft functions.
*	Unify SSE/AVX psin functions.	Gael Guennebaud	2018-11-27
\| \| \| \| \| \| \| \|	It is based on the SSE version which is much more accurate, though very slightly slower. This changeset also includes the following required changes: - add packet-float to packet-int type traits - add packet float<->int reinterpret casts - add faster pselect for AVX based on blendv
*	Collapsed revision (based on pull request PR-325)	Christian von Schultz	2018-10-22
\| \| \| \| \| \| \|	* Support compiling without IO streams Add the preprocessor definition EIGEN_NO_IO which, if defined, disables all use of the IO streams part of the standard library.
*	bug #65: add vectorization of partial reductions along the outer-dimension, ↵	Gael Guennebaud	2018-10-09
\| \| \| \|	for instance: colmajor_mat.rowwise().mean()
*	bug #231: initial implementation of STL iterators for dense expressions	Gael Guennebaud	2018-10-01
\|
*	merge with default Eigen	Gael Guennebaud	2018-09-21
\|\
\| *	Creating separate SYCL required PR for uncontroversial files.	Mehdi Goli	2018-08-03
\| \|
\| *	Add pcast packet op for NEON.	Rasmus Munk Larsen	2018-07-26
\| \|
\| *	Add MIPS changes missing from previous merge.	Alexey Frunze	2018-07-18
\| \|
\| *	More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA	Gael Guennebaud	2018-07-18
\| \|
\| *	Forward declaring std::array does not work with all std libs, so let's just ↵	Gael Guennebaud	2018-07-13
\| \| \| \| \| \| \| \|	include <array>
\| *	Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff at more appropriate ↵	Gael Guennebaud	2018-07-12
\| \| \| \| \| \| \| \| \| \| \| \|	places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h
\| *	Updates corresponding to the latest round of PR feedback	Deven Desai	2018-07-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.
\| *	merging updates from upstream	Deven Desai	2018-07-11
\| \|\
\| * \|	updates based on PR feedback	Deven Desai	2018-06-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two major changes (and a few minor ones which are not listed here...see PR discussion for details) 1. Eigen::half implementations for HIP and CUDA have been merged. This means that - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h` - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h` - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h` After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate. - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC \|\| EIGEN_HIPCC)` - `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH \|\| EIGEN_HIP_DEVICE_COMPILE)` - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`
\| \| *	Extend CUDA support to matrix inversion and selfadjointeigensolver	Andrea Bocci	2018-06-11
\| \| \|
\| * \|	Adding support for using Eigen in HIP kernels.	Deven Desai	2018-06-06
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit enables the use of Eigen in HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVIDIA GPUs. Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor). Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
\| *	Define pcast<> for SSE types even when AVX is enabled. (otherwise floats are ↵	Gael Guennebaud	2018-05-29
\| \| \| \| \| \| \| \|	silently reinterpreted as int instead of being converted)
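The difference between converting a float and reinterpreting its bits can be shown with scalars. This is a minimal sketch, not Eigen's `pcast` implementation: `convert_value`, `reinterpret_bits` and `demo_cast_difference` are hypothetical names, and the bit pattern in the check assumes IEEE-754 single precision.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value conversion: the numeric value is preserved (truncated toward zero).
// This is the behaviour one expects from a cast operation.
inline std::int32_t convert_value(float x) {
  return static_cast<std::int32_t>(x);
}

// Bit reinterpretation: the 32 bits of the float are read back as an integer.
// The numeric value is generally unrelated to the original float.
inline std::int32_t reinterpret_bits(float x) {
  static_assert(sizeof(float) == sizeof(std::int32_t), "sizes must match");
  std::int32_t out;
  std::memcpy(&out, &x, sizeof(out));
  return out;
}

// Hypothetical demo: 1.0f converts to 1, but its IEEE-754 bit pattern
// is 0x3F800000 when read as an integer.
inline void demo_cast_difference() {
  assert(convert_value(1.0f) == 1);
  assert(reinterpret_bits(1.0f) == 0x3F800000);
}
```

A silent fallback from the first behaviour to the second produces plausible-looking but wrong integers, which is why the missing specialization mattered.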
\| *	AVX512: _mm512_rsqrt28_ps is available for AVX512ER only	Gael Guennebaud	2018-04-03
\| \|
\| *	Misc. source and comment typos	luz.paz	2018-03-11
\| \| \| \| \| \| \| \|	Found using `codespell` and `grep` from downstream FreeCAD
\| *	For cuda 9.1 replace math_functions.hpp with cuda_runtime.h	nluehr	2017-12-18
\| \|
\| *	Added support for CUDA 9.0.	Benoit Steiner	2017-08-31
\| \|
\| *	bug #1462: remove all occurrences of the deprecated __CUDACC_VER__ macro by ↵	Gael Guennebaud	2017-08-24
\| \| \| \| \| \| \| \|	introducing EIGEN_CUDACC_VER
* \|	merge	Gael Guennebaud	2017-02-21
\|\ \
* \| \|	Add support for automatic-size deduction in reshaped, e.g.:	Gael Guennebaud	2017-02-21
\| \| \| \| \| \| \| \| \| \| \| \|	mat.reshaped(4,AutoSize); <-> mat.reshaped(4,mat.size()/4);
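The equivalence stated above (`reshaped(4,AutoSize)` versus `reshaped(4,mat.size()/4)`) amounts to deducing one dimension from the total element count. The sketch below illustrates that arithmetic in plain C++; `kAutoSize`, `Shape` and `resolve_shape` are hypothetical names and do not reflect how Eigen implements `AutoSize` internally.

```cpp
#include <cstddef>

// Hypothetical sentinel standing in for Eigen's AutoSize.
constexpr std::ptrdiff_t kAutoSize = -1;

struct Shape {
  std::ptrdiff_t rows;
  std::ptrdiff_t cols;
};

// Resolve at most one kAutoSize entry so that rows * cols == total.
// This sketch assumes `total` is divisible by the known dimension and
// does not validate that assumption.
constexpr Shape resolve_shape(std::ptrdiff_t total, std::ptrdiff_t rows,
                              std::ptrdiff_t cols) {
  return rows == kAutoSize   ? Shape{total / cols, cols}
         : cols == kAutoSize ? Shape{rows, total / rows}
                             : Shape{rows, cols};
}

// A 12-element matrix reshaped to (4, auto) gets 12/4 = 3 columns.
static_assert(resolve_shape(12, 4, kAutoSize).cols == 3, "12 / 4 == 3");
```

Passing two explicit sizes leaves the shape unchanged, so the sentinel is only consulted when a dimension is left open.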
* \| \|	Use fix<> API to specify compile-time reshaped sizes.	Gael Guennebaud	2017-01-29
\| \| \|
* \| \|	Cleanup initial reshape implementation:	Gael Guennebaud	2017-01-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- reshape -> reshaped - make it compatible with evaluators.
* \| \|	import yoco xiao's work on reshape	Gael Guennebaud	2017-01-29
\|\ \ \