eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Merged in rmlarsen/eigen2 (pull request PR-543)	Rasmus Munk Larsen	2018-11-13
\|\	Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
\| \|	Approved-by: Eugene Zhulenev <ezhulenev@google.com>
\| *	Remove accidental changes.	Rasmus Munk Larsen	2018-11-12
\| \|
\| *	Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number ↵	Rasmus Munk Larsen	2018-11-12
\| \|	of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
* \|	[PATCH 1/2] Misc. typos	luz.paz	2018-09-18
\|/	From 68d431b4c14ad60a778ee93c1f59ecc4b931950e Mon Sep 17 00:00:00 2001
\|	Found via `codespell -q 3 -I ../eigen-word-whitelist.txt` where the whitelist consists of: ``` als ans cas dum lastr lowd nd overfl pres preverse substraction te uint whch ```
\|	---
\|	CMakeLists.txt \| 26 +++++++++----------
\|	Eigen/src/Core/GenericPacketMath.h \| 2 +-
\|	Eigen/src/SparseLU/SparseLU.h \| 2 +-
\|	bench/bench_norm.cpp \| 2 +-
\|	doc/HiPerformance.dox \| 2 +-
\|	doc/QuickStartGuide.dox \| 2 +-
\|	.../Eigen/CXX11/src/Tensor/TensorChipping.h \| 6 ++---
\|	.../Eigen/CXX11/src/Tensor/TensorDeviceGpu.h \| 2 +-
\|	.../src/Tensor/TensorForwardDeclarations.h \| 4 +--
\|	.../src/Tensor/TensorGpuHipCudaDefines.h \| 2 +-
\|	.../Eigen/CXX11/src/Tensor/TensorReduction.h \| 2 +-
\|	.../CXX11/src/Tensor/TensorReductionGpu.h \| 2 +-
\|	.../test/cxx11_tensor_concatenation.cpp \| 2 +-
\|	unsupported/test/cxx11_tensor_executor.cpp \| 2 +-
\|	14 files changed, 29 insertions(+), 29 deletions(-)
*	A few small fixes to a) prevent throwing in ctors and dtors of the threading ↵	Rasmus Munk Larsen	2018-11-09
\|	code, and b) support matrix exponential on platforms with 113 bits of mantissa for long doubles.
*	Merged in ezhulenev/eigen-02 (pull request PR-534)	Rasmus Munk Larsen	2018-10-25
\|\	Fix cxx11_tensor_{block_access, reduction} tests
\| *	Fix cxx11_tensor_{block_access, reduction} tests	Eugene Zhulenev	2018-10-25
\| \|
* \|	Fix most Doxygen warnings. Also add links to stable documentation from ↵	Christoph Hertzberg	2018-10-19
\| \|	unsupported modules (by using the corresponding Doxytags file). Manually grafted from d107a371c61b764c73fd1570b1f3ed1c6400dd7e
* \|	bug #1606: Explicitly set the standard before ↵	Christoph Hertzberg	2018-10-19
\| \|	find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 with EIGEN_COMPILER_SUPPORT_CPP11. Grafted manually from a4afa90d161faab385a77f0e2764fb13ff3b9484
* \|	Fix GPU build due to gpu_assert not always being defined.	Rasmus Munk Larsen	2018-10-18
\|/
*	Move from rvalue arguments in ThreadPool enqueue* methods	Eugene Zhulenev	2018-10-16
\|
*	Reduce thread scheduling overhead in parallelFor	Eugene Zhulenev	2018-10-16
\|
*	Merged in ezhulenev/eigen-02 (pull request PR-528)	Rasmus Munk Larsen	2018-10-16
\|\	[TensorBlockIO] Check if it's allowed to squeeze inner dimensions
\| \|	Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
\| *	Check if it's allowed to squeeze inner dimensions in TensorBlockIO	Eugene Zhulenev	2018-10-15
\| \|
* \|	Iterative solvers: unify and fix handling of multiple rhs.	Gael Guennebaud	2018-10-15
\| \|	m_info was not properly computed and the logic was repeated in several places.
* \|	DGMRES: fix null rhs, fix restart, fix m_isDeflInitialized for multiple solve	Gael Guennebaud	2018-10-15
\|/
*	relax number of iterations checks to avoid false negatives	Gael Guennebaud	2018-10-15
\|
*	Make sparse_basic includable from sparse_extra, but disable it since ↵	Gael Guennebaud	2018-10-11
\|	sparse_basic(DynamicSparseMatrix) does not compile at all anyways
*	Fix a lot of Doxygen warnings in Tensor module	Christoph Hertzberg	2018-10-09
\|
*	fix mpreal for mpfr<4.0.0	Gael Guennebaud	2018-10-09
\|
*	Fix out-of-bounds access in TensorArgMax.h.	Rasmus Munk Larsen	2018-10-08
\|
*	Fix contraction test.	Rasmus Munk Larsen	2018-10-08
\|
*	typo	Gael Guennebaud	2018-10-08
\|
*	fix warning in mpreal.h	Gael Guennebaud	2018-10-08
\|
*	Update included mpreal header to 3.6.5 and fix deprecated warnings.	Gael Guennebaud	2018-10-08
\|
*	Workaround stupid warning	Gael Guennebaud	2018-10-08
\|
*	Fix shadow warning	Christoph Hertzberg	2018-10-02
\|
*	Move struct outside of method for C++03 compatibility.	Christoph Hertzberg	2018-10-02
\|
*	Make code compile in C++03 mode again	Christoph Hertzberg	2018-10-02
\|
*	Fix conversion warning ... again	Christoph Hertzberg	2018-10-02
\|
*	Merged in deven-amd/eigen/HIP_fixes (pull request PR-518)	Christoph Hertzberg	2018-10-01
\|\	PR with HIP specific fixes (for the eigen nightly regression failures in HIP mode)
\| *	This commit contains the following (HIP specific) updates:	Deven Desai	2018-10-01
\| \|	- unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h
\| \|	  Changing "pass-by-reference" argument to be "pass-by-value" instead (in a __global__ function decl). "pass-by-reference" arguments to __global__ functions are unwise, and will be explicitly flagged as errors by the newer versions of HIP.
\| \|	- Eigen/src/Core/util/Memory.h
\| \|	- unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h
\| \|	  Changes introduced in recent commits break the HIP compile. Adding EIGEN_DEVICE_FUNC attribute to some functions and calling ::malloc/free instead of the corresponding std:: versions to get the HIP compile working again.
\| \|	- unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h
\| \|	  Change introduced in a recent commit breaks the HIP compile (link stage errors out due to failure to inline a function). Disabling the recently introduced code (only for HIP compile), to get the eigen nightly testing going again. Will submit another PR once we have the proper fix.
\| \|	- Eigen/src/Core/util/ConfigureVectorization.h
\| \|	  Enabling GPU VECTOR support when HIP compiler is in use (for both the host and device compile phases).
* \|	Merged eigen/eigen into default	Rasmus Munk Larsen	2018-09-28
\|\ \
* \| \|	Get rid of unused variable warning.	Rasmus Munk Larsen	2018-09-28
\| \| \|
\| * \|	Fix bug in copy optimization in Tensor slicing.	Eugene Zhulenev	2018-09-28
\|/ /
* \|	Fix a few warnings and rename a variable to not shadow "last".	Rasmus Munk Larsen	2018-09-28
\| \|
* \|	Merged in ezhulenev/eigen-01 (pull request PR-514)	Rasmus Munk Larsen	2018-09-28
\|\ \	Add tests for evalShardedByInnerDim contraction + fix bugs
\| * \|	Add tests for evalShardedByInnerDim contraction + fix bugs	Eugene Zhulenev	2018-09-28
\| \|/
* \|	Fix integer conversion warnings	Christoph Hertzberg	2018-09-28
\| \|
* \|	Provide EIGEN_OVERRIDE and EIGEN_FINAL macros to mark virtual function overrides	Christoph Hertzberg	2018-09-24
\|/
*	Optimize TensorBlockCopyOp	Eugene Zhulenev	2018-09-27
\|
*	Revert code lost in merge	Eugene Zhulenev	2018-09-27
\|
*	Merge with eigen/eigen default	Eugene Zhulenev	2018-09-27
\|\
* \|	Remove explicit mkldnn support and redundant TensorContractionKernelBlocking	Eugene Zhulenev	2018-09-27
\| \|
* \|	Test mkldnn pack for doubles	Eugene Zhulenev	2018-09-26
\| \|
* \|	Conditionally add mkldnn test	Eugene Zhulenev	2018-09-26
\| \|
\| *	Remove "false &&" left over from test.	Rasmus Munk Larsen	2018-09-26
\| \|
\| *	Parallelize tensor contraction over the inner dimension in cases where ↵	Rasmus Munk Larsen	2018-09-26
\| \|	one or both of the outer dimensions (m and n) are small but k is large. This speeds up individual matmul microbenchmarks by up to 85%.
\| \|	Naming below is BM_Matmul_M_K_N_THREADS, measured on a 2-socket Intel Broadwell-based server.
\| \|	Benchmark                Base (ns)  New (ns)  Improvement
\| \|	------------------------------------------------------------------
\| \|	BM_Matmul_1_80_13522_1   387457     396013    -2.2%
\| \|	BM_Matmul_1_80_13522_2   406487     230789    +43.2%
\| \|	BM_Matmul_1_80_13522_4   395821     123211    +68.9%
\| \|	BM_Matmul_1_80_13522_6   391625     97002     +75.2%
\| \|	BM_Matmul_1_80_13522_8   408986     113828    +72.2%
\| \|	BM_Matmul_1_80_13522_16  399988     67600     +83.1%
\| \|	BM_Matmul_1_80_13522_22  411546     60044     +85.4%
\| \|	BM_Matmul_1_80_13522_32  393528     57312     +85.4%
\| \|	BM_Matmul_1_80_13522_44  390047     63525     +83.7%
\| \|	BM_Matmul_1_80_13522_88  387876     63592     +83.6%
\| \|	BM_Matmul_1_1500_500_1   245359     248119    -1.1%
\| \|	BM_Matmul_1_1500_500_2   401833     143271    +64.3%
\| \|	BM_Matmul_1_1500_500_4   210519     100231    +52.4%
\| \|	BM_Matmul_1_1500_500_6   251582     86575     +65.6%
\| \|	BM_Matmul_1_1500_500_8   211499     80444     +62.0%
\| \|	BM_Matmul_3_250_512_1    70297      68551     +2.5%
\| \|	BM_Matmul_3_250_512_2    70141      52450     +25.2%
\| \|	BM_Matmul_3_250_512_4    67872      58204     +14.2%
\| \|	BM_Matmul_3_250_512_6    71378      63340     +11.3%
\| \|	BM_Matmul_3_250_512_8    69595      41652     +40.2%
\| \|	BM_Matmul_3_250_512_16   72055      42549     +40.9%
\| \|	BM_Matmul_3_250_512_22   70158      54023     +23.0%
\| \|	BM_Matmul_3_250_512_32   71541      56042     +21.7%
\| \|	BM_Matmul_3_250_512_44   71843      57019     +20.6%
\| \|	BM_Matmul_3_250_512_88   69951      54045     +22.7%
\| \|	BM_Matmul_3_1500_512_1   369328     374284    -1.4%
\| \|	BM_Matmul_3_1500_512_2   428656     223603    +47.8%
\| \|	BM_Matmul_3_1500_512_4   205599     139508    +32.1%
\| \|	BM_Matmul_3_1500_512_6   214278     139071    +35.1%
\| \|	BM_Matmul_3_1500_512_8   184149     142338    +22.7%
\| \|	BM_Matmul_3_1500_512_16  156462     156983    -0.3%
\| \|	BM_Matmul_3_1500_512_22  163905     158259    +3.4%
\| \|	BM_Matmul_3_1500_512_32  155314     157662    -1.5%
\| \|	BM_Matmul_3_1500_512_44  235434     158657    +32.6%
\| \|	BM_Matmul_3_1500_512_88  156779     160275    -2.2%
\| \|	BM_Matmul_1500_4_512_1   363358     349528    +3.8%
\| \|	BM_Matmul_1500_4_512_2   303134     263319    +13.1%
\| \|	BM_Matmul_1500_4_512_4   176208     130086    +26.2%
\| \|	BM_Matmul_1500_4_512_6   148026     115449    +22.0%
\| \|	BM_Matmul_1500_4_512_8   131656     98421     +25.2%
\| \|	BM_Matmul_1500_4_512_16  134011     82861     +38.2%
\| \|	BM_Matmul_1500_4_512_22  134950     85685     +36.5%
\| \|	BM_Matmul_1500_4_512_32  133165     90081     +32.4%
\| \|	BM_Matmul_1500_4_512_44  133203     90644     +32.0%
\| \|	BM_Matmul_1500_4_512_88  134106     100566    +25.0%
\| \|	BM_Matmul_4_1500_512_1   439243     435058    +1.0%
\| \|	BM_Matmul_4_1500_512_2   451830     257032    +43.1%
\| \|	BM_Matmul_4_1500_512_4   276434     164513    +40.5%
\| \|	BM_Matmul_4_1500_512_6   182542     144827    +20.7%
\| \|	BM_Matmul_4_1500_512_8   179411     166256    +7.3%
\| \|	BM_Matmul_4_1500_512_16  158101     155560    +1.6%
\| \|	BM_Matmul_4_1500_512_22  152435     155448    -1.9%
\| \|	BM_Matmul_4_1500_512_32  155150     149538    +3.6%
\| \|	BM_Matmul_4_1500_512_44  193842     149777    +22.7%
\| \|	BM_Matmul_4_1500_512_88  149544     154468    -3.3%
* \|	Support multiple contraction kernel types in TensorContractionThreadPool	Eugene Zhulenev	2018-09-26
\|/
*	Don't deactivate BVH test for clang (probably, this was failing for very old ↵	Christoph Hertzberg	2018-09-25
\|	versions of clang)