eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Replace numext::as_uint with numext::bit_cast<numext::uint32_t>	David Tellenbach	2020-10-29
\|
*	Add support for Armv8.2-a __fp16	David Tellenbach	2020-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Armv8.2-a provides a native half-precision floating point (__fp16 aka. float16_t). This patch introduces * __fp16 as underlying type of Eigen::half if this type is available * the packet types Packet4hf and Packet8hf representing float16x4_t and float16x8_t respectively * packet-math for the above packets with corresponding scalar type Eigen::half The packet-math functionality has been implemented by Ashutosh Sharma <ashutosh.sharma@amperecomputing.com>. This closes #1940.
*	[Missing SYCL math op]: Adding the missing LDEXP Function for SYCL.	mehdi-goli	2020-10-28
\|
*	[Fixing expf issue]: Eigen uses the packet type operation for scalar type ↵	mehdi-goli	2020-10-28
\| \| \| \|	float on Sigmoid function (https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/functors/UnaryFunctors.h#L990). As a result the SYCL backend breaks, since it only supports packet operations for the vectorized types float4 and double2. The issue has been fixed by adding scalar type float to the packet operation pexp for the SYCL backend.
*	Improve polynomial evaluation with instruction-level parallelism for ↵	guoqiangqi	2020-10-20
\| \| \| \|	pexp_float and pexp<Packet16f>
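One common way to expose instruction-level parallelism in polynomial evaluation is to split the polynomial into even and odd parts, which gives two independent Horner chains in x*x. The sketch below is a scalar illustration of that general idea only; the function names are hypothetical and it is not claimed to be the scheme used in pexp_float.

```cpp
#include <cstddef>

// Hypothetical helper: plain Horner evaluation of c[0] + c[1]*x + ... + c[n-1]*x^(n-1).
// Every multiply-add depends on the previous one, so the chain is fully serial.
inline double horner(const double* c, std::size_t n, double x) {
  double r = 0.0;
  for (std::size_t i = n; i-- > 0;) r = r * x + c[i];
  return r;
}

// Hypothetical helper: same polynomial, split into even and odd coefficients.
// The two accumulators do not depend on each other, so a CPU can overlap them.
inline double horner_even_odd(const double* c, std::size_t n, double x) {
  const double x2 = x * x;
  double even = 0.0;
  double odd = 0.0;
  for (std::size_t i = n; i-- > 0;) {
    if (i % 2 == 0) {
      even = even * x2 + c[i];
    } else {
      odd = odd * x2 + c[i];
    }
  }
  return even + x * odd;
}
```

The split version performs one extra multiplication (x*x) and may round differently from plain Horner for general inputs, which is the usual trade-off for the shorter dependency chain.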
*	Remove unnecessary template specializations of pexp for scalar float/double	guoqiangqi	2020-10-19
\|
*	Fix missing `pfirst<Packet16b>` for MSVC.	Antonio Sanchez	2020-10-16
\| \| \| \| \|	It was only defined under one `#ifdef` case. This fixes the `packetmath_14` test for MSVC.
*	Fix the specialization of pfrexp for AVX to be faster when AVX2/AVX512DQ is ↵	Rasmus Munk Larsen	2020-10-15
\| \| \| \|	not available, and avoid undefined behavior in C++. Also mask off the sign bit when extracting the exponent.
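The sign-bit masking mentioned in the entry above can be illustrated on a single scalar. This is a minimal sketch assuming IEEE 754 binary32 floats; `biased_exponent` is a hypothetical helper, not the AVX pfrexp specialization.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the biased 8-bit exponent field of a float.
// Without clearing the sign bit, a negative input would leave bit 31 set and
// the shifted result would be off by 256.
inline int biased_exponent(float x) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));  // well-defined bit copy
  bits &= 0x7FFFFFFFu;                   // mask off the sign bit
  return static_cast<int>(bits >> 23);   // drop the 23 mantissa bits
}
```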
*	Revert change from 4e4d3f32d168ed9ce09d950f099a60ddcd11240f that broke ↵	Rasmus Munk Larsen	2020-10-15
\| \| \| \|	BFloat16.h build with older compilers.
*	Add AVX plog<Packet4d> and AVX512 plog<Packet8d> ops, also unified AVX512 ↵	Guoqiang QI	2020-10-15
\| \| \| \|	plog<Packet16f> op with generic api
*	Add specializations for pmin/pmax with prescribed NaN propagation semantics ↵	Rasmus Munk Larsen	2020-10-14
\| \| \| \|	for SSE/AVX/AVX512.
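To make "prescribed NaN propagation semantics" concrete, here is a scalar sketch of a minimum with two explicit policies. The enum and function names are hypothetical and chosen for illustration; they are not taken from the Eigen sources.

```cpp
#include <cmath>

// Hypothetical policy names for this sketch.
enum class NanPolicy { PropagateNaN, PropagateNumbers };

// Hypothetical helper: scalar min whose result for NaN inputs is fixed by Policy.
//  - PropagateNaN:     the result is NaN if either input is NaN.
//  - PropagateNumbers: the result is the non-NaN input if exactly one input is NaN.
template <NanPolicy Policy>
float min_with_policy(float a, float b) {
  const bool a_nan = std::isnan(a);
  const bool b_nan = std::isnan(b);
  if (a_nan || b_nan) {
    if (Policy == NanPolicy::PropagateNaN) return a_nan ? a : b;
    return a_nan ? b : a;  // both NaN still yields NaN
  }
  return b < a ? b : a;
}
```

Spelling the policy out matters because hardware min/max instructions typically return one fixed operand when a NaN is present, so the result otherwise depends on argument order.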
*	undefine EIGEN_CONSTEXPR before redefinition	acxz	2020-10-12
\|
*	Clean up packetmath tests and fix various bugs to make bfloat16 pass ↵	Rasmus Munk Larsen	2020-10-09
\| \| \| \|	(almost) all packetmath tests with SSE, AVX, and AVX512.
*	Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h.	Rasmus Munk Larsen	2020-10-02
\|
*	Fix build breakage with MSVC 2019, which does not support MMX intrinsics for ↵	Rasmus Munk Larsen	2020-10-01
\| \| \| \| \| \| \| \|	64 bit builds, see: https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c Instead use the equivalent SSE2 intrinsics.
*	Specialize pldexp_double and pfrexp_double and get rid of Packet2l ↵	Rasmus Munk Larsen	2020-09-30
\| \| \| \|	definition for SSE. SSE does not support conversion between 64 bit integers and double and the existing implementation of casting between Packet2d and Packet2l results in undefined behavior when casting NaN to int. Since pldexp and pfrexp only manipulate exponent fields that fit in 32 bit, this change provides specializations that use existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead.
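The observation that ldexp-style operations only touch an exponent field that fits in 32 bits can be shown with a scalar sketch. It assumes IEEE 754 binary64 doubles and a restricted exponent range; both function names are hypothetical and this is not the SSE specialization itself.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: builds the double 2^e directly from its exponent field.
// Only valid for normal results, i.e. -1022 <= e <= 1023 (an assumption of this sketch).
inline double pow2_from_exponent(std::int32_t e) {
  static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits wide");
  // The biased exponent (e + 1023) is a small 32-bit quantity; it is only
  // widened to 64 bits to be shifted into position above the 52 mantissa bits.
  const std::uint64_t bits = static_cast<std::uint64_t>(e + 1023) << 52;
  double result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}

// Hypothetical helper: x * 2^e for exponents in the range stated above.
inline double ldexp_sketch(double x, std::int32_t e) {
  return x * pow2_from_exponent(e);
}
```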
*	Fix compilation of 64 bit constant arguments to pset1frombits in ↵	Rasmus Munk Larsen	2020-09-28
\| \| \| \|	TypeCasting.h on platforms where uint64_t != unsigned long.
*	Fix compilation of pset1frombits calls on iOS.	Rasmus Munk Larsen	2020-09-28
\|
*	Provide a more efficient Packet2l->Packet2d cast method	Christoph Hertzberg	2020-09-28
\|
*	Fix for ROCm/HIP breakage - 200921	Deven Desai	2020-09-22
\| \| \| \| \| \| \| \| \| \| \| \| \|	The following commit causes regressions in the ROCm/HIP support for Eigen https://gitlab.com/libeigen/eigen/-/commit/e55182ac09885d7558adf75e9e230b051a721c18 I suspect the same breakages occur on the CUDA side too. The above commit puts the EIGEN_CONSTEXPR attribute on `half_base` constructor. `half_base` is derived from `__half_raw`. When compiling with GPU support, the definition of `__half_raw` gets picked up from the GPU Compiler specific header files (`hip_fp16.h`, `cuda_fp16.h`). Properly supporting the above commit would require adding the `constexpr` attribute to the `__half_raw` constructor (and other `half` routines) in those header files. While that is something we can explore in the future, for now we need to undo the above commit when compiling with GPU support, which is what this commit does. This commit also reverts a small change in the `raw_uint16_to_half` routine made by the above commit. Similar to the case above, that change was leading to compile errors due to the fact that `__half_raw` has a different definition when compiling with GPU support.
*	Fix the #issue1997 and #issue1991 bug triggered by unsupported a[index] (type ↵	Guoqiang QI	2020-09-21
\| \| \| \|	of a: __m128d) ops with the MSVC compiler
*	Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being ↵	Rasmus Munk Larsen	2020-09-18
\| \| \| \| \| \|	available on 32 bit x86. If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.
*	Fix undefined reference to pset1frombits bug on different platforms	guoqiangqi	2020-09-19
\|
*	Get rid of initialization logic for blueNorm by making the computed ↵	Rasmus Munk Larsen	2020-09-18
\| \| \| \| \| \|	constants static const or constexpr. Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.
*	Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h.	Rasmus Munk Larsen	2020-09-18
\| \| \|	'vmvnq_u64' does not exist for some reason.
*	Fix typo in PacketMath.h	Rasmus Munk Larsen	2020-09-18
\|
*	Add missing packet op pcmp_lt_or_nan for Packet2d on ARM.	Rasmus Munk Larsen	2020-09-18
\|
*	Add support for CastXML on ARM aarch64	Brad King	2020-09-16
\| \| \| \| \| \| \| \|	CastXML simulates the preprocessors of other compilers, but actually parses the translation unit with an internal Clang compiler. Use the same `vld1q_u64` workaround that we do for Clang. Fixes: #1979
*	Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in ↵	Benoit Jacob	2020-09-15
\| \| \| \|	the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.
*	Make bfloat16(float(-nan)) produce -nan, not nan.	Tim Shen	2020-09-15
\|
*	Add plog ops support packet2d for NEON	Guoqiang QI	2020-09-15
\|
*	Unified sse pldexp_double api	Guoqiang QI	2020-09-12
\|
*	Fix half_impl::float_to_half_rtne(float) warning: '<<' causes overflow	Niels Dekker	2020-09-10
\| \| \| \| \| \|	Fixed Visual Studio 2019 Code Analysis (C++ Core Guidelines) warning C26450 from inside `half_impl::float_to_half_rtne(float)`: > Arithmetic overflow: '<<' operation causes overflow at compile time.
*	Add missing functions for Packet8bf in Altivec architecture.	Pedro Caldeira	2020-09-08
\| \| \| \| \|	Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.
*	Add Neon psqrt<Packet2d> and pexp<Packet2d>	Guoqiang QI	2020-09-08
\|
*	MatrixProduct enhancements:	Everton Constantino	2020-09-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	- Changes to Altivec/MatrixProduct Adapting code to gcc 10. Generic code style and performance enhancements. Adding PanelMode support. Adding stride/offset support. Enabling float64, std::complex<float> and std::complex<double>. Fixing lack of symm_pack. Enabling mixedtypes. - Adding std::complex tests to blasutil. - Adding an implementation of storePacketBlock when Incr != 1.
*	Changing u/int8_t to un/signed char because clang does not understand	Everton Constantino	2020-09-02
\| \| \| \| \| \|	it. Implementing pcmp_eq to Packet8 and Packet16.
*	Change Packet8s and Packet8us to use vector commands on Power for pmadd, ↵	Chip Kerchner	2020-08-28
\| \| \| \|	pmul and psub.
*	add psqrt ops support packet2f/packet4f for NEON	Guoqiang QI	2020-08-21
\|
*	bfloat16 packetmath for Arm Neon backend	David Tellenbach	2020-08-13
\|
*	Add support for Bfloat16 to use vector instructions on Altivec	Pedro Caldeira	2020-08-10
\| \| \| \|	architecture
*	Temporarily turn off the NEON implementation of pfloor as it does not work ↵	Zachary Garrett	2020-08-04
\| \| \| \| \| \|	for large values. The NEON implementation mimics the SSE implementation, but didn't mention the caveat that, because of the signed integer conversions involved, not all values representable in the original floating point type are supported.
*	Fix undefined BF16 union behavior in AVX512.	Teng Lu	2020-07-29
\|
*	Fix clang-tidy warnings in generic bfloat16 implementation	David Tellenbach	2020-07-27
\| \| \| \|	See !172 for related discussions.
*	Fix bfloat16 casts	David Tellenbach	2020-07-23
\| \| \| \| \| \| \|	If we have explicit conversion operators available (C++11) we define explicit casts from bfloat16 to other types. If not (C++03), we don't define conversion operators but rely on implicit conversion chains from bfloat16 over float to other types.
*	Revert change that made conversion from bfloat16 to {float, double} implicit.	Rasmus Munk Larsen	2020-07-22
\| \| \| \|	Add roundtrip tests for casting between bfloat16 and complex types.
*	Fix cast of bfloat16 to std::complex<T>	David Tellenbach	2020-07-22
\| \| \| \|	This fixes https://gitlab.com/libeigen/eigen/-/issues/1951
*	Make sure we take the little-endian path if __BYTE_ORDER__ is not defined.	Rasmus Munk Larsen	2020-07-22
\|
*	Faster conversion from integer types to bfloat16	Niels Dekker	2020-07-22
\| \| \| \| \| \|	Specialized `bfloat16_impl::float_to_bfloat16_rtne(float)` for normal floating point numbers, infinity and zero, in order to improve the performance of `bfloat16::bfloat16(const T&)` for integer argument types. A reduction of more than 20% of the runtime duration of conversion from int to bfloat16 was observed, using Visual C++ 2019 on Windows 10.
*	Avoid undefined behavior by union type punning in float_to_bfloat16_rtne	Niels Dekker	2020-07-14
\| \| \| \| \| \| \| \|	Use `numext::as_uint`, instead of union based type punning, to avoid undefined behavior. See also C++ Core Guidelines: "Don't use a union for type punning" https://github.com/isocpp/CppCoreGuidelines/blob/v0.8/CppCoreGuidelines.md#c183-dont-use-a-union-for-type-punning `numext::as_uint` was suggested by David Tellenbach
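For readers unfamiliar with the issue, reading a union member other than the one last written is undefined behavior in C++, while copying the bytes with `std::memcpy` is well defined. The following is a minimal sketch of that replacement technique, assuming 32-bit IEEE 754 floats; `float_bits` is a hypothetical helper and not the `numext` implementation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the object representation of a float as a
// 32-bit unsigned integer without union-based type punning.
inline std::uint32_t float_bits(float value) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));  // compilers typically lower this to a register move
  return bits;
}
```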