eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Add support for Arm SVE	David Tellenbach	2021-01-21
	This patch adds support for Arm's new vector extension SVE (Scalable Vector Extension). In contrast to other vector extensions that are supported by Eigen, SVE types are inherently sizeless. For use in Eigen we fix their size at compile time (note that this is not necessary in general; SVE is length-agnostic). During compilation the flag `-msve-vector-bits=N` has to be set, where `N` is a power of two in the range of `128` to `2048`, indicating the length of an SVE vector.
	Since SVE is rather young, we decided to disable it by default even if it would be available. A user has to enable it explicitly by defining `EIGEN_ARM64_USE_SVE`.
	This patch introduces the packet types `PacketXf` and `PacketXi` for packets of `float` and `int32_t` respectively. The size of these packets depends on the SVE vector length. E.g. if `-msve-vector-bits=512` is set, `PacketXf` will contain `512/32 = 16` elements.
	This MR is joint work with Miguel Tairum <miguel.tairum@arm.com>.
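The packet-size arithmetic described above can be sketched in a few lines of C++. This is only an illustration of the `512/32 = 16` relation and of the stated constraint on `N`; the helper names `sve_packet_size`, `is_valid_sve_vector_bits` and `checked_sve_packet_size` are hypothetical and are not part of Eigen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Eigen): number of elements in a
// fixed-size SVE packet, given the vector length in bits.
template <typename Scalar>
constexpr std::size_t sve_packet_size(std::size_t vector_bits) {
  return vector_bits / (8 * sizeof(Scalar));
}

// Hypothetical helper: true if n is a power of two in [128, 2048],
// i.e. the range the commit message gives for -msve-vector-bits=N.
constexpr bool is_valid_sve_vector_bits(std::size_t n) {
  return n >= 128 && n <= 2048 && (n & (n - 1)) == 0;
}

// Same computation, but rejects vector lengths outside the stated range.
template <typename Scalar>
std::size_t checked_sve_packet_size(std::size_t vector_bits) {
  assert(is_valid_sve_vector_bits(vector_bits));
  return sve_packet_size<Scalar>(vector_bits);
}

// The example from the commit message: 512-bit vectors hold 16 floats.
static_assert(sve_packet_size<float>(512) == 16, "512 / 32 == 16");
static_assert(sve_packet_size<std::int32_t>(512) == 16, "512 / 32 == 16");
```

Under these assumptions, a build with `-msve-vector-bits=256` would give packets of `256/32 = 8` elements for both `float` and `int32_t`.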
*	Add support for Armv8.2-a __fp16	David Tellenbach	2020-10-28
	Armv8.2-a provides a native half-precision floating point (__fp16 aka. float16_t). This patch introduces
	* __fp16 as underlying type of Eigen::half if this type is available
	* the packet types Packet4hf and Packet8hf representing float16x4_t and float16x8_t respectively
	* packet-math for the above packets with corresponding scalar type Eigen::half
	The packet-math functionality has been implemented by Ashutosh Sharma <ashutosh.sharma@amperecomputing.com>.
	This closes #1940.
*	Support BFloat16 in Eigen	Teng Lu	2020-06-20
*	Fixing HIP breakage caused by the recent commit that introduces Packet4h2 as ↵	Deven Desai	2020-03-12
	the Eigen::Half packet type
*	Merged in ↵	Rasmus Larsen	2019-12-04
	anshuljl/eigen-2/Anshul-Jaiswal/update-configurevectorizationh-to-not-op-1573079916090 (pull request PR-754)
	Update ConfigureVectorization.h to not optimize fp16 routines when compiling with cuda.
	Approved-by: Deven Desai <deven.desai.amd@gmail.com>
*	[SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master ↵	Mehdi Goli	2019-11-28
	branch.
	* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
	* Adding get_tensor operation which was missing in TensorContractionMapper.
	* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
	* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
	* Fixing typo in Device SYCL
	* Unifying load to private register for tall/skinny no shared
	* Unifying load to vector tile for tensor-vector/vector-tensor operation
	* Removing all the LHS/RHS class for extracting data from global
	* Removing Outputfunction from TensorContractionSkinnyNoshared.
	* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
	* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
	* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
	* Making double buffering optional for Tensor contraction when the local memory version is used.
	* Modifying benchmark to accept custom Reduction Sizes
	* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
	* Adding Test for SYCL
	* Modifying SYCL CMake
*	Update ConfigureVectorization.h to not optimize fp16 routines when compiling ↵	Anshul Jaiswal	2019-11-06
	with cuda.
*	Disable AVX on broken xcode versions. See PR 748.	Gael Guennebaud	2019-11-12
	Patch adapted from Hans Johnson's PR 748.
*	Add workaround for choosing the right include files with FP16C support with ↵	Rasmus Munk Larsen	2019-06-05
	clang.
*	Clean up CUDA/NVCC version macros and their use in Eigen, and a few other ↵	Rasmus Munk Larsen	2019-05-31
	CUDA build failures.
*	Enable support for F16C with Clang. The required intrinsics were added here: ↵	Rasmus Munk Larsen	2019-05-20
	https://reviews.llvm.org/D16177 and are part of LLVM 3.8.0.
*	updates requested in the PR feedback. Also dropping code within #ifdef ↵	Deven Desai	2019-03-19
	EIGEN_HAS_OLD_HIP_FP16
*	bug #1678: Fix lack of __FMA__ macro on MSVC with AVX512	Gael Guennebaud	2019-02-15
*	Replace host_define.h with cuda_runtime_api.h	nluehr	2019-01-18
*	Replace compiler's alignas/alignof extension by respective c++11 keywords ↵	Gael Guennebaud	2019-01-11
	when available. This also fixes a compilation issue with gcc-4.7.
*	Enable FMA with MSVC (through /arch:AVX2). To make this possible, I also had ↵	Gael Guennebaud	2018-12-07
	to turn the #warning regarding AVX512-FMA to a #error.
*	bug #1638: add a warning if avx512 is enabled without SSE/AVX FMA	Gael Guennebaud	2018-12-07
*	#elif -> #else to fix GPU build.	Rasmus Munk Larsen	2018-12-05
*	Update checks in ConfigureVectorization.h	Eugene Zhulenev	2018-12-03
*	Do not disable alignment with EIGEN_GPUCC	Eugene Zhulenev	2018-12-03
*	Fix typo in comment on EIGEN_MAX_STATIC_ALIGN_BYTES	Nikolaus Demmel	2018-11-14
*	This commit contains the following (HIP specific) updates:	Deven Desai	2018-10-01
	- unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h
	  Changing "pass-by-reference" argument to be "pass-by-value" instead (in a __global__ function decl). "pass-by-reference" arguments to __global__ functions are unwise, and will be explicitly flagged as errors by the newer versions of HIP.
	- Eigen/src/Core/util/Memory.h
	- unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h
	  Changes introduced in recent commits break the HIP compile. Adding EIGEN_DEVICE_FUNC attribute to some functions and calling ::malloc/free instead of the corresponding std:: versions to get the HIP compile working again.
	- unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h
	  A change introduced in a recent commit breaks the HIP compile (link stage errors out due to failure to inline a function). Disabling the recently introduced code (only for HIP compile), to get the eigen nightly testing going again. Will submit another PR once we have the proper fix.
	- Eigen/src/Core/util/ConfigureVectorization.h
	  Enabling GPU VECTOR support when HIP compiler is in use (for both the host and device compile phases).
*	Provide EIGEN_ALIGNOF macro, and give handmade_aligned_malloc the ↵	Christoph Hertzberg	2018-09-14
	possibility for alignments larger than the standard alignment.
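For readers unfamiliar with hand-made aligned allocation, the following is a minimal sketch of one common technique: over-allocate with `std::malloc`, round up to the requested alignment, and stash the original pointer just before the aligned block. It is not taken from this commit, and whether Eigen's `handmade_aligned_malloc` matches it in detail is an assumption. The names `sketch_aligned_malloc` and `sketch_aligned_free` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical illustration, not Eigen's actual code.
// Returns a block of `size` bytes whose address is a multiple of `alignment`.
inline void* sketch_aligned_malloc(std::size_t size, std::size_t alignment) {
  // The alignment must be a power of two and leave room for one pointer.
  assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);

  // Over-allocate so an aligned address always fits inside the block.
  void* original = std::malloc(size + alignment);
  if (original == nullptr) return nullptr;

  // Round down to a multiple of `alignment`, then step forward by one
  // full alignment unit; the result lies strictly after `original`.
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(original);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment - 1);
  void* aligned = reinterpret_cast<void*>((addr & ~mask) + alignment);

  // Remember the pointer returned by malloc right before the aligned block.
  *(reinterpret_cast<void**>(aligned) - 1) = original;
  return aligned;
}

// Releases a block obtained from sketch_aligned_malloc.
inline void sketch_aligned_free(void* ptr) {
  if (ptr != nullptr) std::free(*(reinterpret_cast<void**>(ptr) - 1));
}
```

Because the padding equals the requested alignment rather than a fixed constant, the same routine works for alignments larger than the platform's standard one, for example 64 bytes for 512-bit vector loads.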
*	Add MIPS changes missing from previous merge.	Alexey Frunze	2018-07-18
*	Add support for MIPS SIMD (MSA)	Alexey Frunze	2018-07-06
*	Clean up the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate ↵	Gael Guennebaud	2018-07-12
	places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h