eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Fix regression: .conjugate() was popped out but not re-introduced.	Gael Guennebaud	2019-02-18
\|
*	GEMM: catch all scalar-multiple variants when falling back to a coeff-based product.	Gael Guennebaud	2019-02-18
\|	Before, only s*A*B was caught, which was inconsistent with GEMM, sub-optimal, and could even lead to compilation errors (https://stackoverflow.com/questions/54738495).
*	bug #1680: improve MSVC inlining by declaring many trivial constructors and accessors as STRONG_INLINE.	Gael Guennebaud	2019-02-15
*	PR 526: Speed up multiplication of small, dynamically sized matrices	Mark D Ryan	2018-10-12
\|	The Packet16f, Packet8f and Packet8d types are too large to use with dynamically sized matrices typically processed by the SliceVectorizedTraversal specialization of the dense_assignment_loop. Using these types is likely to lead to little or no vectorization. Significant slowdown in the multiplication of these small matrices can be observed when building with AVX and AVX512 enabled.
\|	This patch introduces a new dense_assignment_kernel that is used when computing small products whose operands have dynamic dimensions. It ensures that the PacketSize used is no larger than 4, thereby increasing the chance that vectorized instructions will be used when computing the product.
\|	I tested all 969 possible combinations of M, K, and N that are handled by the dense_assignment_loop on x86 builds. Although a few combinations are slowed down by this patch, they are far outnumbered by the cases that are sped up, as the following results demonstrate.
\|	Disabling Packet8d on AVX512 builds:
\|	- Total Cases: 969
\|	- Better: 511
\|	- Worse: 85
\|	- Same: 373
\|	- Max Improvement: 169.00% (4 8 6)
\|	- Max Degradation: 36.50% (8 5 3)
\|	- Median Improvement: 35.46%
\|	- Median Degradation: 17.41%
\|	- Total FLOPs Improvement: 19.42%
\|	Disabling Packet16f and Packet8f on AVX512 builds:
\|	- Total Cases: 969
\|	- Better: 658
\|	- Worse: 5
\|	- Same: 306
\|	- Max Improvement: 214.05% (8 6 5)
\|	- Max Degradation: 22.26% (16 2 1)
\|	- Median Improvement: 60.05%
\|	- Median Degradation: 13.32%
\|	- Total FLOPs Improvement: 59.58%
\|	Disabling Packet8f on AVX builds:
\|	- Total Cases: 969
\|	- Better: 663
\|	- Worse: 96
\|	- Same: 210
\|	- Max Improvement: 155.29% (4 10 5)
\|	- Max Degradation: 35.12% (8 3 2)
\|	- Median Improvement: 34.28%
\|	- Median Degradation: 15.05%
\|	- Total FLOPs Improvement: 26.02%
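The reasoning behind the PacketSize cap can be illustrated with a small sketch. Both helpers below are hypothetical and are not part of Eigen; they only assume that a narrower packet is obtained by repeatedly halving the native width, and that a contiguous run of coefficients can only be vectorized in whole packets.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an Eigen API): halve a packet width until it
// is no larger than the cap. The cap of 4 is taken from the commit text.
constexpr std::size_t capped_packet_size(std::size_t native_size,
                                         std::size_t cap = 4) {
    return native_size > cap ? capped_packet_size(native_size / 2, cap)
                             : native_size;
}

// Hypothetical helper: number of coefficients in a contiguous run of
// `length` that whole packets can cover. The remaining
// length % packet_size coefficients fall back to scalar code.
constexpr std::size_t vectorized_coeffs(std::size_t length,
                                        std::size_t packet_size) {
    return (length / packet_size) * packet_size;
}
```

Under these assumptions, a run of 6 floats gets no vectorized coefficients with a 16-wide packet (`vectorized_coeffs(6, 16)` is 0), whereas a 4-wide packet covers 4 of them and leaves 2 for scalar code.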
*	Fix logic in diagonal*dense product in a corner case.	Gael Guennebaud	2018-09-22
\|	The problem was for: diag(1x1) * mat(1,n)
*	Fix doxy and misc. typos	luz.paz	2018-08-01
\|	Found via `codespell -q 3 -I ../eigen-word-whitelist.txt`
\|	---
\|	Eigen/src/Core/ProductEvaluators.h \| 4 ++--
\|	Eigen/src/Core/arch/GPU/Half.h \| 2 +-
\|	Eigen/src/Core/util/Memory.h \| 2 +-
\|	Eigen/src/Geometry/Hyperplane.h \| 2 +-
\|	Eigen/src/Geometry/Transform.h \| 2 +-
\|	Eigen/src/Geometry/Translation.h \| 12 ++++++------
\|	doc/PreprocessorDirectives.dox \| 2 +-
\|	doc/TutorialGeometry.dox \| 2 +-
\|	test/boostmultiprec.cpp \| 2 +-
\|	test/triangular.cpp \| 2 +-
\|	10 files changed, 16 insertions(+), 16 deletions(-)
*	merging updates from upstream	Deven Desai	2018-07-11
\|\
\| *	Introduce the macro ei_declare_local_nested_eval to help allocate local temporaries on the stack via alloca, and let outer products make good use of it.	Gael Guennebaud	2018-07-09
\| \|	If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.
* \|	updates based on PR feedback	Deven Desai	2018-06-14
\| \|	There are two major changes (and a few minor ones which are not listed here; see the PR discussion for details).
\| \|	1. Eigen::half implementations for HIP and CUDA have been merged. This means that:
\| \|	- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
\| \|	- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
\| \|	- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`
\| \|	After this change the `HIP/hcc` directory only contains one file, `math_constants.h`. That will go away too once that file becomes a part of the HIP install.
\| \|	2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
\| \|	- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
\| \|	- `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
\| \|	- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`
* \|	syncing this fork with upstream	Deven Desai	2018-06-13
\|\ \
\| \| *	Extend CUDA support to matrix inversion and selfadjointeigensolver	Andrea Bocci	2018-06-11
\| \| \|
\| \| *	bug #1562: optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary.	Gael Guennebaud	2018-07-02
\| \| \|	Measured speedup from 2x to 5x...
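The idea behind this rewrite can be sketched without Eigen. The code below is an illustration only: `Mat` and both functions are hypothetical names, not Eigen types or APIs. It contrasts evaluating the product into a temporary and scaling it afterwards with applying the scalar as each product coefficient is computed, which is the effect the commit attributes to `s*(A.lazyProduct(B))`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size square matrix type; this is not an Eigen type.
template <std::size_t N>
using Mat = std::array<std::array<double, N>, N>;

// Eager form: materialize A*B in a temporary, then scale in a second pass.
template <std::size_t N>
Mat<N> scaled_product_with_temporary(double s, const Mat<N>& a, const Mat<N>& b) {
    Mat<N> tmp{};  // the extra temporary, zero-initialized
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                tmp[i][j] += a[i][k] * b[k][j];
    Mat<N> result{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            result[i][j] = s * tmp[i][j];
    return result;
}

// Lazy form: compute each product coefficient on demand and scale it as
// it is written, so no intermediate matrix is needed.
template <std::size_t N>
Mat<N> scaled_product_lazy(double s, const Mat<N>& a, const Mat<N>& b) {
    Mat<N> result{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                acc += a[i][k] * b[k][j];
            result[i][j] = s * acc;
        }
    return result;
}
```

Both functions return the same matrix; the second one simply avoids the intermediate storage and the second pass over it.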
\| \| *	bug #1560 fix product with a 1x1 diagonal matrix	Gael Guennebaud	2018-06-25
\| \|/
\| *	Missing line during manual rebase of PR-374	Christoph Hertzberg	2018-06-07
\| \|
* \|	Adding support for using Eigen in HIP kernels.	Deven Desai	2018-06-06
\| \|	This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
\| \|	Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).
\| \|	Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
\| *	Adding EIGEN_DEVICE_FUNC to Products, especially Dense2Dense Assignment specializations.	Robert Lukierski	2018-03-14
\|/	Otherwise this causes problems with small fixed-size matrix multiplication (call to 0x00 in call_assignment_no_alias in debug mode or trap in release with CUDA 9.1).
*	Adds missing EIGEN_STRONG_INLINE to support MSVC properly inlining small vector calculations	Basil Fierz	2017-10-26
\|	When working with MSVC, small vector operations are often not properly inlined. This behaviour is observed even on the most recent compiler versions.
*	Add an EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases	Gael Guennebaud	2017-07-17
*	bug #1435: fix aliasing issue in expressions like: A = C - B*A;	Gael Guennebaud	2017-06-08
\|
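Why an expression such as A = C - B*A is sensitive to aliasing can be shown with a small standalone sketch. The type and function names below are hypothetical and do not reflect how Eigen implements the fix; the sketch only demonstrates that writing coefficients of A while the product still reads A gives a different result from evaluating the right-hand side into a temporary first.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 2x2 matrix type for illustration; this is not an Eigen type.
using Mat2 = std::array<std::array<double, 2>, 2>;

// Unsafe: overwrites a[i][j] while later coefficients still need the
// original values of a on the right-hand side.
inline void assign_in_place(Mat2& a, const Mat2& c, const Mat2& b) {
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < 2; ++k)
                acc += b[i][k] * a[k][j];
            a[i][j] = c[i][j] - acc;
        }
}

// Safe: evaluate C - B*A into a temporary, then copy it into A.
inline void assign_via_temporary(Mat2& a, const Mat2& c, const Mat2& b) {
    Mat2 tmp{};
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < 2; ++k)
                acc += b[i][k] * a[k][j];
            tmp[i][j] = c[i][j] - acc;
        }
    a = tmp;
}
```

With A = [[1, 2], [3, 4]], B filled with ones and C filled with zeros, the temporary-based version yields [[-4, -6], [-4, -6]], while the in-place version produces a different second row because it reads the already overwritten first row.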
*	Operators += and -= do not resize!	Gael Guennebaud	2016-12-02
\|
*	Fix a performance regression in (mat*mat)*vec for which mat*mat was evaluated multiple times.	Gael Guennebaud	2016-11-30
*	Fix regression in X = (X*X.transpose())/s with X rectangular by deferring resizing of the destination after the creation of the evaluator of the source expression.	Gael Guennebaud	2016-10-26
*	Fix ICC warnings	Gael Guennebaud	2016-10-25
\|
*	Use explicit type casting to generate packets of zeros.	Benoit Steiner	2016-10-04
\|
*	bug #1308: fix compilation of some small products involving nullary-expressions.	Gael Guennebaud	2016-09-29
\|
*	Add debug info.	Gael Guennebaud	2016-09-26
\|
*	bug #1311: fix alignment logic in some cases of (scalar*small).lazyProduct(small)	Gael Guennebaud	2016-09-26
*	bug #1308: fix compilation of vector * rowvector::nullary.	Gael Guennebaud	2016-09-25
\|
*	bug #1283: quick fix for products involving uncommon general block access to vectors.	Gael Guennebaud	2016-08-31
*	Optimize expression matching "d?=a-b*c" as "d?=a; d?=b*c;"	Gael Guennebaud	2016-08-23
\|
*	Fix vectorization logic for coeff-based product for some corner cases.	Gael Guennebaud	2016-07-31
\|
*	Vectorize more small product expressions by letting the general assignment logic decide on the sizes that are OK for vectorization.	Gael Guennebaud	2016-07-28
*	Allows the compiler to inline outer products (the change from default to dont-inline in changeset 737bed19c1fdb01568706bca19666531dda681a7 was not motivated)	Gael Guennebaud	2016-07-22
*	Re-enable some specializations for Assignment<.,Product<>>	Gael Guennebaud	2016-07-05
\|
*	Fix template resolution.	Gael Guennebaud	2016-07-04
\|
*	Implement scalar multiples and division by a scalar as a binary-expression with a constant expression.	Gael Guennebaud	2016-06-14
\|	This slightly complicates the type of the expressions and implies that we now have to distinguish between scalar*expr and expr*scalar to catch scalar-multiple expressions (e.g., see BlasUtil.h), but this brings several advantages:
\|	- it makes it clear on which side the scalar is applied,
\|	- it clearly reflects that we are dealing with a binary-expression,
\|	- the complexity of the type is hidden through macros defined at the end of Macros.h,
\|	- distinguishing between "scalar op expr" and "expr op scalar" is important to support non-commutative fields (like quaternions),
\|	- "scalar op expr" is now fully equivalent to "ConstantExpr(scalar) op expr",
\|	- scalar_multiple_op, scalar_quotient1_op and scalar_quotient2_op are not used anymore in officially supported modules (still used in Tensor).
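The point about non-commutative scalar types can be made concrete with a minimal quaternion sketch. The `Quat` struct and its operators below are hypothetical and unrelated to Eigen's own Quaternion class; they only show that multiplying on the left and on the right can give different results, which is why "scalar op expr" and "expr op scalar" must be kept distinct.

```cpp
#include <cassert>

// Hypothetical minimal quaternion w + x*i + y*j + z*k (not Eigen's Quaternion).
struct Quat {
    double w, x, y, z;
};

// Hamilton product, which is not commutative in general.
inline Quat operator*(const Quat& a, const Quat& b) {
    return Quat{
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w};
}

// Exact component-wise comparison, sufficient for the unit quaternions used here.
inline bool operator==(const Quat& a, const Quat& b) {
    return a.w == b.w && a.x == b.x && a.y == b.y && a.z == b.z;
}
```

For the basis elements i = (0, 1, 0, 0) and j = (0, 0, 1, 0), the product i*j is k = (0, 0, 0, 1) while j*i is -k, so a matrix of such scalars scaled from the left differs from the same matrix scaled from the right.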
*	Disable shortcuts for res ?= prod when the scalar types do not match exactly.	Gael Guennebaud	2016-06-06
\|
*	Relax mixing-type constraints for binary coefficient-wise operators:	Gael Guennebaud	2016-06-06
\|	- Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP>
\|	- Remove the "functor_is_product_like" helper (was pretty ugly)
\|	- Currently, OP is not used, but it is available to the user for fine grained tuning
\|	- Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-=
\|	- TODO: generalize all other binary operators (comparisons,pow,etc.)
\|	- TODO: handle "scalar op array" operators (currently only * is handled)
\|	- TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits
*	bug #1181: help MSVC inlining.	Gael Guennebaud	2016-05-31
\|
*	Fix static/inline order.	Gael Guennebaud	2016-05-25
\|
*	bug #1207: Add and fix logical-op warnings	Christoph Hertzberg	2016-05-11
\|
*	Make use of is_same_dense helper instead of extract_data to detect whether input/outputs are the same.	Gael Guennebaud	2016-04-13
*	Fix incomplete previous patch on matrix comparison.	Gael Guennebaud	2016-04-13
\|
*	Fix detection of same matrices when both matrices are not handled by extract_data.	Gael Guennebaud	2016-04-13
*	Enable the use of half-packet in coeff-based product.	Gael Guennebaud	2016-04-12
\|	For instance, Matrix4f*Vector4f is now vectorized again when using AVX.
*	Removed executable bit from header files	Benoit Steiner	2016-03-23
\|
*	Improve inlining	Gael Guennebaud	2016-02-08
\|
*	bug #1144: fix regression in x=y+A*x (aliasing), and move evaluator_traits::AssumeAliasing to evaluator_assume_aliasing.	Gael Guennebaud	2016-01-09
*	Fix sign-unsigned issue in enum	Gael Guennebaud	2015-12-09
\|
*	Fix Alignment in coeff-based product, and enable unaligned vectorization	Gael Guennebaud	2015-12-08
\|