eigen - C++ library for linear algebra

	Commit message (Collapse)	Author	Age
*	Implement a generic vectorized version of Smith's algorithms for complex ↵	Rasmus Munk Larsen	2021-07-01
\| \| \| \|	division.
*	Create the ability to disable the specialized gemm_pack_rhs in Eigen (only ↵	Chip Kerchner	2021-06-30
\| \| \| \|	PPC) for TensorFlow
*	Small cleanup: Get rid of the macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and ↵	Rasmus Munk Larsen	2021-06-24
\| \| \| \|	CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code.
*	Get rid of code duplication for conj_helper. For packets where ↵	Rasmus Munk Larsen	2021-06-24
\| \| \| \|	LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations.
*	EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X ↵	Chip-Kerchner	2021-06-16
\| \| \| \|	slowdown) when used with TensorFlow. Changing to EIGEN_ALWAYS_INLINE where appropriate.
*	Add missing ppc pcmp_lt_or_nan<Packet8bf>	Antonio Sanchez	2021-06-15
\|
*	Use bit_cast to create -0.0 for floating point types to avoid compiler ↵	Rasmus Munk Larsen	2021-06-11
\| \| \| \|	optimization changing sign with -ffast-math enabled.
*	Fix taking address of rvalue compiler issue with TensorFlow (plus other ↵	Chip-Kerchner	2021-04-21
\| \| \| \|	warnings).
*	Fix address of temporary object errors in clang11.	Chip Kerchner	2021-04-02
\| \| \| \|	This fixes the problem with taking the address of temporary objects which clang11 treats as errors.
*	Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3).	Chip Kerchner	2021-03-25
\|
*	Fix pround and add print	Chip Kerchner	2021-03-15
\|
*	Make half/bfloat16 constructor take inputs by value, fix powerpc test.	Antonio Sanchez	2021-02-27
\| \| \| \| \| \| \| \| \| \| \| \|	Since `numeric_limits<half>::max_exponent` is a static inline constant, it cannot be directly passed by reference. This triggers a linker error in recent versions of `g++-powerpc64le`. Changing `half` to take inputs by value fixes this. Wrapping `max_exponent` with `int(...)` to make an addressable integer also fixes this and may help with other custom `Scalar` types down-the-road. Also eliminated some compile warnings for powerpc.
*	Fix clang compile when no MMA flags are set. Simplify MMA compiler detection.	Chip-Kerchner	2021-02-24
\|
*	Having forward template function declarations in a P10 file causes bad code ↵	Chip-Kerchner	2021-02-24
\| \| \| \|	in certain situations.
*	Fixes to support old and new versions of the compilers for built-ins. Cast ↵	Chip-Kerchner	2021-02-24
\| \| \| \|	to non-const when using vector_pair with certain built-ins.
*	Fix compilation errors with later versions of GCC and use of MMA.	Chip-Kerchner	2021-02-22
\|
*	Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_product	Chip Kerchner	2021-02-17
\|
*	Updated pfrexp implementation.	Antonio Sanchez	2021-02-17
\| \| \| \| \| \|	The original implementation fails for 0, denormals, inf, and NaN. See #2150
*	Fix ldexp implementations.	Antonio Sanchez	2021-02-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous implementations produced garbage values if the exponent did not fit within the exponent bits. See #2131 for a complete discussion, and !375 for other possible implementations. Here we implement the 4-factor version. See `pldexp_impl` in `GenericPacketMathFunctions.h` for a full description. The SSE `pcmp*` methods were moved down since `pcmp_le<Packet4i>` requires `por`. Left as a "TODO" is to delegate to a faster version if we know the exponent does fit within the exponent bits. Fixes #2131.
*	Eliminate implicit conversions from float to double.	Antonio Sanchez	2021-02-01
\|
*	Fix altivec packetmath.	Antonio Sanchez	2021-01-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allows the altivec packetmath tests to pass. There were a few issues: - `pstoreu` was missing MSQ on `_BIG_ENDIAN` systems - `cmp_*` didn't properly handle conversion of bool flags (0x7FC instead of 0xFFFF) - `pfrexp` needed to set the `exponent` argument. Related to !370, #2128 cc: @ChipKerchner @pdrocaldeira Tested on `_BIG_ENDIAN` running on QEMU with VSX. Couldn't figure out build flags to get it to work for little endian.
*	Fix clang compilation for AltiVec from previous check-in	Chip Kerchner	2021-01-28
\|
*	Fix sqrt, ldexp and frexp compilation errors.	Chip Kerchner	2021-01-25
\|
*	Add support for dynamic dispatch of MMA instructions for POWER 10	Pedro Caldeira	2020-11-12
\|
*	Add missing functions for Packet8bf in Altivec architecture.	Pedro Caldeira	2020-09-08
\| \| \| \| \|	Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.
*	MatrixProduct enhancements:	Everton Constantino	2020-09-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	- Changes to Altivec/MatrixProduct: Adapting code to gcc 10. Generic code style and performance enhancements. Adding PanelMode support. Adding stride/offset support. Enabling float64, std::complex and std::complex. Fixing lack of symm_pack. Enabling mixedtypes. - Adding std::complex tests to blasutil. - Adding an implementation of storePacketBlock when Incr != 1.
*	Changing u/int8_t to un/signed char because clang does not understand	Everton Constantino	2020-09-02
\| \| \| \| \| \|	it. Implementing pcmp_eq to Packet8 and Packet16.
*	Change Packet8s and Packet8us to use vector commands on Power for pmadd, ↵	Chip Kerchner	2020-08-28
\| \| \| \|	pmul and psub.
*	Add support for Bfloat16 to use vector instructions on Altivec	Pedro Caldeira	2020-08-10
\| \| \| \|	architecture
*	Fix pscatter and pgather for Altivec Complex double	Pedro Caldeira	2020-06-16
\|
*	Add pscatter for Packet16{u}c (int8)	Pedro Caldeira	2020-05-20
\|
*	- Vectorizing MMA packing.	Everton Constantino	2020-05-19
\| \| \| \| \|	- Optimizing MMA kernel. - Adding PacketBlock store to blas_data_mapper.
*	Altivec template functions to better code reusability	Pedro Caldeira	2020-05-11
\|
*	Remove unused packet op "palign".	Rasmus Munk Larsen	2020-05-07
\| \| \| \|	Clean up a compiler warning in c++03 mode in AVX512/Complex.h.
*	Add support to vector instructions to Packet16uc and Packet16c	Pedro Caldeira	2020-04-27
\|
*	Remove unused packet op "preduxp".	Rasmus Munk Larsen	2020-04-23
\|
*	Add Packet8s and Packet8us to support signed/unsigned int16/short Altivec ↵	Pedro Caldeira	2020-04-21
\| \| \| \|	vector operations
*	Adhere to recommended load/store intrinsics for ppc64le	Everton Constantino	2020-03-23
\|
*	Fixing float32's pround halfway criteria to match STL's criteria.	Everton Constantino	2020-03-21
\|
*	Add shift_left<N> and shift_right<N> coefficient-wise unary Array functions	Joel Holdsworth	2020-03-19
\|
*	Switching unpacket_traits<Packet4i> to vectorizable=true.	Everton Constantino	2020-01-13
\|
*	Move implementation of vectorized error function erf() to ↵	Rasmus Munk Larsen	2019-09-27
\| \| \| \|	SpecialFunctionsImpl.h.
*	Add generic PacketMath implementation of the Error Function (erf).	Rasmus Munk Larsen	2019-09-19
\|
*	Fix compilation without vector engine available (e.g., x86 with SSE disabled):	Gael Guennebaud	2019-09-05
\| \| \| \|	-> ppolevl is required by ndtri even for the scalar path
*	Fix debug macros in p{load,store}u	João P. L. de Carvalho	2019-08-14
\|
*	Add missing pcmp_XX methods for double/Packet2d	João P. L. de Carvalho	2019-08-14
\| \| \| \|	This actually fixes an issue in unit-test packetmath_2 with pcmp_eq when it is compiled with clang. When pcmp_eq(Packet4f,Packet4f) is used instead of pcmp_eq(Packet2d,Packet2d), the unit-test does not pass due to NaN on ref vector.
*	Fix packed load/store for PowerPC's VSX	João P. L. de Carvalho	2019-08-09
\| \| \| \| \| \| \| \|	The vec_vsx_ld/vec_vsx_st builtins were wrongly used for aligned load/store. In fact, they perform unaligned memory access and, even when the address is 16-byte aligned, they are much slower (at least 2x) than their aligned counterparts. For double/Packet2d, vec_xl/vec_xst should be preferred over vec_ld/vec_st, although the latter works when cast to float/Packet4f. Silences some weird warnings thrown by some GCC versions. Such warnings are not thrown by Clang.
*	Fix offset argument of ploadu/pstoreu for Altivec	João P. L. de Carvalho	2019-08-09
\| \| \| \| \| \| \| \| \| \|	If no offset is given, then it should be zero. Also passes full address to vec_vsx_ld/st builtins. Removes useless _EIGEN_ALIGNED_PTR & _EIGEN_MASK_ALIGNMENT. Removes unnecessary casts.
*	bug #1718: Add cast to successfully compile with clang on PowerPC	João P. L. de Carvalho	2019-08-09
\| \| \| \|	Ignoring -Wc11-extensions warnings thrown by clang at Altivec/PacketMath.h
*	Add masked_store_available to unpacket_traits	Eugene Zhulenev	2019-05-02
\|
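
The most recent entry in the log above mentions Smith's algorithm for complex division. For readers unfamiliar with it, here is a minimal scalar sketch of the technique. It is not Eigen's vectorized implementation, and the helper name `smith_divide` is hypothetical. The idea is to divide through by the larger-magnitude component of the denominator, so that the intermediate `c*c + d*d` is never formed and cannot overflow or underflow on its own.

```cpp
#include <cmath>
#include <complex>

// Scalar sketch of Smith's algorithm for (a + ib) / (c + id).
// `smith_divide` is a hypothetical name used for illustration only;
// it is not part of Eigen's API.
template <typename T>
std::complex<T> smith_divide(const std::complex<T>& x, const std::complex<T>& y) {
  const T a = x.real(), b = x.imag();
  const T c = y.real(), d = y.imag();
  if (std::abs(c) >= std::abs(d)) {
    // Scale by c: r = d/c has magnitude <= 1.
    const T r = d / c;
    const T den = c + d * r;  // equals (c^2 + d^2) / c
    return std::complex<T>((a + b * r) / den, (b - a * r) / den);
  } else {
    // Scale by d: r = c/d has magnitude < 1.
    const T r = c / d;
    const T den = c * r + d;  // equals (c^2 + d^2) / d
    return std::complex<T>((a * r + b) / den, (b * r - a) / den);
  }
}
```

A vectorized version would presumably evaluate both branches on whole packets and select between them with a comparison mask rather than branching per element. That is an assumption about the general approach, not something the log states.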