eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Fix some enum-enum conversion warnings	Christoph Hertzberg	2021-02-27
\| \| \| \|	(cherry picked from commit 838f3d8ce22a5549ef10c7386fb03040721749a0)
*	Fixed/masked more implicit copy constructor warnings	Christoph Hertzberg	2021-02-27
\| \| \| \|	(cherry picked from commit 2883e91ce5a99c391fbf28e20160176b70854992)
*	Fix double-promotion warnings	Christoph Hertzberg	2021-02-27
\| \| \| \|	(cherry picked from commit c22c103e932e511e96645186831363585a44b7a3)
*	Fix NEON sqrt for 32-bit, add prsqrt.	Antonio Sanchez	2021-02-26
\| \| \| \| \| \| \| \| \| \| \| \|	With !406, we accidentally broke arm 32-bit NEON builds, since `vsqrt_f32` is only available for 64-bit. Here we add back the `rsqrt` implementation for 32-bit, relying on a `prsqrt` implementation with better handling of edge cases. Note that several of the 32-bit NEON packet tests are currently failing - either due to denormal handling (NEON versions flush to zero, but scalar paths don't) or due to accuracy (e.g. sin/cos).
*	Merge branch 'rmlarsen1/eigen-nan_prop'	Rasmus Munk Larsen	2021-02-26
\|\
\| *	Merge branch 'nan_prop' of https://gitlab.com/rmlarsen1/eigen into nan_prop	Rasmus Munk Larsen	2021-02-26
\| \|\
\| * \|	Add TODO.	Rasmus Munk Larsen	2021-02-26
\| \| \|
\| * \|	Defer default for minCoeff/maxCoeff to templated variant.	Rasmus Munk Larsen	2021-02-26
\| \| \|
* \| \|	Fix floor/ceil for NEON fp16.	Antonio Sanchez	2021-02-25
\| \| \| \| \| \| \| \| \| \| \| \|	Forgot to test this. Fixes bug introduced in !416.
* \| \|	Fix SSE/NEON pfloor/pceil for saturated values.	Antonio Sanchez	2021-02-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original will saturate if the input does not fit into an integer type. Here we fix this, returning the input if it doesn't have enough precision to have a fractional part. Also added `pceil` for NEON. Fixes #1969.
\| \| *	Fix indentation.	Rasmus Munk Larsen	2021-02-25
\| \| \|
\| \| *	Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff ↵	Rasmus Munk Larsen	2021-02-25
\| \|/ \|/\| \| \| \| \|	reductions.
* \|	Fix clang compile when no MMA flags are set. Simplify MMA compiler detection.	Chip-Kerchner	2021-02-24
\| \|
\| *	Fix indentation.	Rasmus Munk Larsen	2021-02-24
\| \|
\| *	Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff ↵	Rasmus Munk Larsen	2021-02-25
\|/ \| \| \|	reductions.
*	Remove unused function scalar_cmp_with_cast.	Rasmus Munk Larsen	2021-02-24
\|
*	Cast anonymous enums to int when used in expressions.	Rasmus Munk Larsen	2021-02-24
\|
*	Having forward template function declarations in a P10 file causes bad code ↵	Chip-Kerchner	2021-02-24
\| \| \| \|	in certain situations.
*	Add `invoke_result` and eliminate `result_of` warnings for C++17+.	Antonio Sanchez	2021-02-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `std::result_of` meta struct is deprecated in C++17 and removed in C++20. It was still slipping through due to a faulty definition of `EIGEN_HAS_STD_RESULT_OF`. Added a new macro `EIGEN_HAS_STD_INVOKE_RESULT` and `Eigen::internal::invoke_result` implementation with fallback for pre C++17. Replaces the `result_of` definition with one based on `std::invoke_result` for C++17 and higher. For completeness, added nullary op support for c++03. Fixes #1850.
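As a rough illustration of the approach this commit message describes, the sketch below uses `std::invoke_result` when compiling as C++17 or later and otherwise falls back to a `decltype`-based definition. The `sketch` namespace and the `AddHalf` functor are hypothetical names. Eigen's own `Eigen::internal::invoke_result` also covers older compilers, which this sketch does not attempt, and the plain `__cplusplus` check is an assumption that may need adjusting on MSVC.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {  // hypothetical namespace, not part of Eigen

#if __cplusplus >= 201703L
// C++17 and later: defer to the standard trait, so std::result_of is never named.
template <typename F, typename... Args>
struct invoke_result : std::invoke_result<F, Args...> {};
#else
// C++11/14 fallback: deduce the type of a plain call expression.
// This only handles callables usable as f(args...), not member pointers.
template <typename F, typename... Args>
struct invoke_result {
  typedef decltype(std::declval<F>()(std::declval<Args>()...)) type;
};
#endif

}  // namespace sketch

// Hypothetical functor used to exercise the trait.
struct AddHalf {
  double operator()(int x) const { return x + 0.5; }
};

// The same check as a compile-time assertion.
static_assert(
    std::is_same<sketch::invoke_result<AddHalf, int>::type, double>::value,
    "AddHalf(int) should be deduced as returning double");
```

Code written against `sketch::invoke_result<F, Args...>::type` compiles unchanged on both sides of the C++17 boundary, which is the point of routing through a wrapper instead of naming `std::result_of` directly.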
*	Fixes to support old and new versions of the compilers for built-ins. Cast ↵	Chip-Kerchner	2021-02-24
\| \| \| \|	to non-const when using vector_pair with certain built-ins.
*	Fix CUDA device new and delete, and add test.	Antonio Sanchez	2021-02-24
\| \| \| \|	HIP does not support new/delete on device, so test is skipped.
*	Disable fast psqrt for NEON.	Antonio Sanchez	2021-02-23
\| \| \| \| \| \| \|	Accuracy is too poor - requires at least two Newton iterations, but then it is no longer significantly faster than `vsqrt`. Fixes #2094.
*	Fix check if GPU compile phase for std::hash	Antonio Sanchez	2021-02-23
\|
*	Fix some CUDA warnings.	Antonio Sanchez	2021-02-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added `EIGEN_HAS_STD_HASH` macro, checking for C++11 support and not running on GPU. `std::hash<float>` is not a device function, so cannot be used by `std::hash<bfloat16>`. Removed `EIGEN_DEVICE_FUNC` and only define if `EIGEN_HAS_STD_HASH`. Same for `half`. Added `EIGEN_CUDA_HAS_FP16_ARITHMETIC` to improve readability, eliminate warnings about `EIGEN_CUDA_ARCH` not being defined. Replaced a couple C-style casts with `reinterpret_cast` for aligned loading of `half` to `half2`. This eliminates `-Wcast-align` warnings in clang. Although not ideal due to potential type aliasing, this is how CUDA handles these conversions internally.
*	Accurate pow, part 2. This change adds specializations of log2 and exp2 for ↵	Rasmus Munk Larsen	2021-02-23
\| \| \| \| \| \| \|	double that make pow<double> accurate to 1 ULP. Speed for AVX-512 is within 0.5% of the current implementation.
*	Fixed sparse conservativeResize() when both num cols and rows decreased.	Adam Shapiro	2021-02-23
\| \| \| \| \|	The previous implementation caused a buffer overflow trying to calculate non-zero counts for columns that no longer exist.
*	Fix compilation errors with later versions of GCC and use of MMA.	Chip-Kerchner	2021-02-22
\|
*	Fixes Bug #1925. Packets should be passed by const reference, even to inline ↵	Christoph Hertzberg	2021-02-20
\| \| \| \|	functions.
*	Bug #1910: Make SparseCholesky work for RowMajor matrices	Christoph Hertzberg	2021-02-19
\|
*	Revert "add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros ↵	Antonio Sánchez	2021-02-19
\| \| \| \| \|	(only if not HIPCC)." This reverts commit 12fd3dd655e37ba26e7ab236d32163e0aa35da39
*	Use the Cephes double subtraction trick in pexp<float> even when FMA is ↵	Rasmus Munk Larsen	2021-02-18
\| \| \| \|	available. Otherwise the accuracy drops from 1 ulp to 3 ulp.
*	add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if ↵	Masaki Murooka	2021-02-17
\| \| \| \|	not HIPCC).
*	Bump to 3.4.99	David Tellenbach	2021-02-17
\|
*	Define internal::make_unsigned for [unsigned]long long on macOS.	David Tellenbach	2021-02-17
\| \| \| \| \| \| \| \| \| \|	macOS defines int64_t as long long even for C++03 and therefore expects a template specialization internal::make_unsigned<long long>, for C++03. Since other platforms define int64_t as long for C++03 we cannot add the specialization for all cases.
*	Fix uninitialized warning on AVX.	Antonio Sanchez	2021-02-17
\|
*	Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_product	Chip Kerchner	2021-02-17
\|
*	New accurate algorithm for pow(x,y). This version is accurate to 1.4 ulps ↵	Rasmus Munk Larsen	2021-02-17
\| \| \| \|	for float, while still being 10x faster than std::pow for AVX512. A future change will introduce a specialization for double.
*	Updated pfrexp implementation.	Antonio Sanchez	2021-02-17
\| \| \| \| \| \|	The original implementation fails for 0, denormals, inf, and NaN. See #2150
*	missing method in packetmath.h void ptranspose(PacketBlock<Packet16uc, 4>& ↵	Ashutosh Sharma	2021-02-16
\| \| \| \|	kernel)
*	Avoid -Wunused warnings in NDEBUG builds.	Jan van Dijk	2021-02-12
\| \| \| \| \| \| \| \|	In two places in SuperLUSupport.h, a local variable 'size' is created that is used only inside an eigen_assert. Remove these, just fetch the required values inside the assert statements. This avoids annoying -Wunused warnings (and -Werror=unused errors) in NDEBUG builds.
*	Use vrsqrts for rsqrt Newton iterations.	Antonio Sanchez	2021-02-11
\| \| \| \| \|	It's slightly faster and slightly more accurate, allowing our current packetmath tests to pass for sqrt with a single iteration.
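The Newton refinement mentioned here can be sketched in scalar code. The NEON `vrsqrts` instruction computes `(3 - a * b) / 2`; the helper `rsqrt_step` below is a scalar stand-in for that step, and `refine_rsqrt` is a hypothetical name. The initial estimate is passed in by the caller rather than obtained from a hardware estimate instruction, so this is not Eigen's packet code.

```cpp
#include <cassert>
#include <cmath>

// Scalar stand-in for the NEON vrsqrts step: returns (3 - a * b) / 2.
inline float rsqrt_step(float a, float b) { return (3.0f - a * b) * 0.5f; }

// Refines an estimate y of 1/sqrt(x) using the given number of Newton steps.
// Each step computes y * (3 - x * y * y) / 2.
inline float refine_rsqrt(float x, float y, int iterations) {
  assert(x > 0.0f && iterations >= 0);
  for (int i = 0; i < iterations; ++i) {
    y = y * rsqrt_step(x * y, y);
  }
  return y;
}
```

For example, `refine_rsqrt(2.0f, 0.7f, 1)` moves the rough estimate 0.7 closer to 1/sqrt(2), and a second iteration moves it closer still. How many iterations are needed in practice depends on the quality of the starting estimate.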
*	Adjust bounds for pexp_float/double	Antonio Sanchez	2021-02-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original clamping bounds on `_x` actually produce finite values: ``` exp(88.3762626647950) = 2.40614e+38 < 3.40282e+38 exp(709.437) = 1.27226e+308 < 1.79769e+308 ``` so with an accurate `ldexp` implementation, `pexp` fails for large inputs, producing finite values instead of `inf`. This adjusts the bounds slightly outside the finite range so that the output will overflow to +/- `inf` as expected.
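The claim that the old clamping bounds still produce finite values can be checked with standard library calls. The helper `exp_fits_in` is a hypothetical name, and the second float input (88.73) is an illustrative value chosen to lie just past the finite range, not the bound Eigen adopted.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True when exp(x), evaluated in double, does not exceed the largest finite T.
template <typename T>
bool exp_fits_in(double x) {
  return std::exp(x) <= static_cast<double>(std::numeric_limits<T>::max());
}
```

With this helper, `exp_fits_in<float>(88.3762626647950)` holds, which matches the figures quoted in the commit message. An input slightly larger than the natural log of the largest float, such as 88.73, no longer fits. The double case behaves the same way around 709.437.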
*	Fix ldexp implementations.	Antonio Sanchez	2021-02-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous implementations produced garbage values if the exponent did not fit within the exponent bits. See #2131 for a complete discussion, and !375 for other possible implementations. Here we implement the 4-factor version. See `pldexp_impl` in `GenericPacketMathFunctions.h` for a full description. The SSE `pcmp*` methods were moved down since `pcmp_le<Packet4i>` requires `por`. Left as a "TODO" is to delegate to a faster version if we know the exponent does fit within the exponent bits. Fixes #2131.
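A scalar sketch of the four-factor idea for `float` follows. It is not the code in `pldexp_impl`: the function names and the clamp to [-278, 278] are assumptions of this sketch. The point is that each of the four power-of-two factors has an exponent that fits in the 8-bit exponent field, even when the full exponent `e` does not.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Builds 2^k for k in [-126, 127] by writing the biased exponent field directly.
inline float pow2_from_bits(int k) {
  assert(k >= -126 && k <= 127);
  const std::uint32_t bits = static_cast<std::uint32_t>(k + 127) << 23;
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

// Computes a * 2^e as a * 2^b * 2^b * 2^b * 2^(e - 3b), with b = floor(e / 4).
inline float ldexp_four_factor(float a, int e) {
  // Assumed clamp: beyond this range the result saturates to 0 or inf anyway.
  e = std::min(std::max(e, -278), 278);
  const int b = (e >= 0) ? e / 4 : -((-e + 3) / 4);  // floor division by 4
  const float c = pow2_from_bits(b);                 // b is in [-70, 69]
  const float partial = a * c * c * c;
  return partial * pow2_from_bits(e - 3 * b);        // exponent in [-70, 72]
}
```

For instance, scaling the smallest denormal by 2^276 needs an exponent that a single factor cannot represent, yet the product of the four factors reaches 2^127. Results that land in the denormal range may be rounded more than once in this sketch.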
*	loop less ptranspose	Ashutosh Sharma	2021-02-10
\|
*	Remove vim specific comments to recognize correct file-type.	David Tellenbach	2021-02-09
\| \| \| \|	As discussed in #2143 we remove editor specific comments.
*	Replace nullptr by NULL in SparseLU.h to be C++03 compliant.	David Tellenbach	2021-02-09
\|
*	add specialization of check_sparse_solving() for SuperLU solver, in order to ↵	Ralf Hannemann-Tamas	2021-02-08
\| \| \| \|	test adjoint and transpose solves
*	Fix documentation typos in LDLT.h	Nikolaus Demmel	2021-02-08
\|
*	Enable bdcsvd on host.	Antonio Sanchez	2021-02-08
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently if compiled by NVCC, the `MatrixBase::bdcSvd()` implementation is skipped, leading to a linker error. This prevents it from running on the host as well. Seems it was disabled 6 years ago (5384e891) to match `jacobiSvd`, but `jacobiSvd` is now enabled on host. Tested and runs fine on host, but will not compile/run for device (though it's not labelled as a device function, so this should be fine). Fixes #2139
*	Add more tests for pow and fix a corner case for huge exponent where the ↵	Rasmus Munk Larsen	2021-02-05
\| \| \| \|	result is always zero or infinite unless x is one.