eigen - C++ library for linear algebra

	Commit message	Author	Age
...
*	Add recursive work splitting to EvalShardedByInnerDimContext	Eugene Zhulenev	2019-12-05
\|
*	Improve performance of contraction kernels	Artem Belevich	2019-12-05
\|	* Force-inline implementations. They pass around pointers to shared memory blocks. Without inlining compiler must operate via generic pointers. Inlining allows compiler to detect that we're operating on shared memory which allows generation of substantially faster code.
\|	* Fixed a long-standing typo which resulted in launching 8x more kernels than we needed (.z dimension of the block is unused by the kernel).
*	Capture TensorMap by value inside tensor expression AST	Eugene Zhulenev	2019-12-03
\|
*	Remove __host__ annotation for device-only function.	Rasmus Munk Larsen	2019-12-03
\|
*	Use EIGEN_DEVICE_FUNC macro instead of __device__.	Rasmus Munk Larsen	2019-12-03
\|
*	[SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch.	Mehdi Goli	2019-11-28
\|	* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
\|	* Adding get_tensor operation which was missing in TensorContractionMapper.
\|	* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
\|	* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
\|	* Fixing typo in Device SYCL
\|	* Unifying load to private register for tall/skinny no shared
\|	* Unifying load to vector tile for tensor-vector/vector-tensor operation
\|	* Removing all the LHS/RHS class for extracting data from global
\|	* Removing Outputfunction from TensorContractionSkinnyNoshared.
\|	* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
\|	* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
\|	* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
\|	* Making double buffering optional for Tensor contraction when the local memory version is used.
\|	* Modifying benchmark to accept custom Reduction Sizes
\|	* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
\|	* Adding Test for SYCL
\|	* Modifying SYCL CMake
*	Add async evaluation support to TensorReverse	Eugene Zhulenev	2019-11-26
\|
*	Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling	Eugene Zhulenev	2019-11-26
\|
*	Remove legacy block evaluation support	Eugene Zhulenev	2019-11-12
\|
*	Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs.	Rasmus Munk Larsen	2019-11-11
\|
*	Break loop dependence in TensorGenerator block access	Eugene Zhulenev	2019-11-11
\|
*	Add EIGEN_HAS_INTRINSIC_INT128 macro	Rasmus Munk Larsen	2019-11-06
\|	Add a new EIGEN_HAS_INTRINSIC_INT128 macro, and use this instead of __SIZEOF_INT128__. This fixes related issues with TensorIntDiv.h when building with Clang for Windows, where support for 128-bit integer arithmetic is advertised but broken in practice.
*	Rollback of PR-746 and partial rollback of https://bitbucket.org/eigen/eigen/commits/668ab3fc474e54c7919eda4fbaf11f3a99246494 .	Rasmus Munk Larsen	2019-11-05
\|	std::array is still not supported in CUDA device code on Windows.
*	Remove internal::smart_copy and replace with std::copy	Eugene Zhulenev	2019-10-29
\|
*	Fix CXX11Meta compilation with MSVC	Eugene Zhulenev	2019-10-28
\|
*	Prevent potential ODR in TensorExecutor	Eugene Zhulenev	2019-10-28
\|
*	This PR fixes:	Mehdi Goli	2019-10-23
\|	* The specialization of array class in the different namespace for GCC <= 6.4
\|	* The implicit call to `std::array` constructor using the initializer list for GCC <= 6.1
*	Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738)	Rasmus Larsen	2019-10-22
\|\	Fix for the HIP build+test errors.
* \|	Add block evaluation V2 to TensorAsyncExecutor.	Rasmus Munk Larsen	2019-10-22
\| \|	Add async evaluation to a number of ops.
\| *	Fix for the HIP build+test errors.	Deven Desai	2019-10-22
\| \|	The errors were introduced by this commit :
\| \|	After the above mentioned commit, some of the tests started failing with the following error
\| \|	```
\| \|	Built target cxx11_tensor_reduction
\| \|	Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
\| \|	In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
\| \|	In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117:
\| \|	/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible
\| \|	    DestinationBufferKind m_kind;
\| \|	    ^
\| \|	/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible
\| \|	  DestinationBuffer m_destination;
\| \|	  ^
\| \|	```
\| \|	For some reason HIPCC does not like device code to contain enum types which do not have the base-type explicitly declared. The fix is trivial, explicitly state "int" as the basetype
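The fix described in the commit body above (giving an enum an explicit base type) can be sketched in standard C++11 as follows. This is an illustrative sketch only, not the actual TensorBlockV2.h change: the enumerator names, the `Destination` struct and the `kind_as_int` helper are hypothetical, and the snippet shows only the language feature, not HIPCC's behavior.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an enum declared without a base type: the
// underlying type is left to the implementation.
enum ImplicitKind { kImplicitEmpty, kImplicitContiguous };

// Same shape of enum with the base type stated explicitly as int, which is
// the kind of change the commit message describes.
enum ExplicitKind : int { kExplicitEmpty, kExplicitContiguous };

// Hypothetical struct holding the enum as a field, mirroring the "field type"
// wording of the error message.
struct Destination {
  ExplicitKind m_kind;
};

// With an explicit base type the underlying type is guaranteed to be int.
static_assert(
    std::is_same<std::underlying_type<ExplicitKind>::type, int>::value,
    "ExplicitKind must have int as its underlying type");

// Hypothetical helper: converts the enum value to its underlying integer.
inline int kind_as_int(ExplicitKind kind) { return static_cast<int>(kind); }
```

The snippet deliberately has no `main`; it is meant to be compiled as part of a translation unit, with the `static_assert` checking the underlying type at compile time.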
* \|	Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers.	Rasmus Munk Larsen	2019-10-18
\|/
*	Propagate block evaluation preference through rvalue tensor expressions	Eugene Zhulenev	2019-10-17
\|
*	Cleanup Tensor block destination and materialized block storage allocation	Eugene Zhulenev	2019-10-16
\|
*	TensorBroadcasting support for random/uniform blocks	Eugene Zhulenev	2019-10-16
\|
*	Block evaluation for TensorGenerator/TensorReverse/TensorShuffling	Eugene Zhulenev	2019-10-14
\|
*	Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op	Eugene Zhulenev	2019-10-10
\|
*	Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing	Eugene Zhulenev	2019-10-09
\|
*	Add block evaluation to TensorEvalTo and fix few small bugs	Eugene Zhulenev	2019-10-07
\|
*	Fixing incorrect size in Tensor documentation.	Brian Zhao	2019-10-04
\|
*	Fix compilation warnings and errors with clang in TensorBlockV2 code and tests	Eugene Zhulenev	2019-10-04
\|
*	Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect	Eugene Zhulenev	2019-10-02
\|
*	Add beta to TensorContractionKernel and make memset optional	Eugene Zhulenev	2019-10-02
\|
*	Fix compilation warnings and errors with clang in TensorBlockV2	Eugene Zhulenev	2019-09-25
\|
*	Fix a bug in a packed block type in TensorContractionThreadPool	Eugene Zhulenev	2019-09-24
\|
*	Choose TensorBlock StridedLinearCopy type statically	Eugene Zhulenev	2019-09-24
\|
*	Add new TensorBlock api implementation + tests	Eugene Zhulenev	2019-09-24
\|
*	Tensor block evaluation V2 support for unary/binary/broadcasting	Eugene Zhulenev	2019-09-24
\|
*	Fix (or mask away) conversion warnings introduced in 553caeb6a3bb545aef895f8fc9f219be44679017 .	Christoph Hertzberg	2019-09-23
\|
*	Add support for asynchronous evaluation of tensor casting expressions.	Rasmus Munk Larsen	2019-09-19
\|
*	Merging eigen/eigen.	Srinivas Vasudevan	2019-09-16
\|\
* \|	Add Bessel functions to SpecialFunctions.	Srinivas Vasudevan	2019-09-14
\| \|	- Split SpecialFunctions files into a separate BesselFunctions file.
\| \|	In particular add:
\| \|	- Modified Bessel functions of the second kind k0, k1, k0e, k1e
\| \|	- Bessel functions of the first kind j0, j1
\| \|	- Bessel functions of the second kind y0, y1
\| *	Fix maybe-uninitialized warnings in TensorContractionThreadPool	Eugene Zhulenev	2019-09-13
\| \|
\| *	Use ThreadLocal container in TensorContractionThreadPool	Eugene Zhulenev	2019-09-13
\|/
*	Update ThreadLocal to use separate Initialize/Release callables	Eugene Zhulenev	2019-09-10
\|
*	ThreadLocal container that does not rely on thread local storage	Eugene Zhulenev	2019-09-09
\|
*	PR 681: Add ndtri function, the inverse of the normal distribution function.	Srinivas Vasudevan	2019-08-12
\|
*	Allow move-only done callback in TensorAsyncDevice	Eugene Zhulenev	2019-09-03
\|
*	TensorMap constness should not change underlying storage constness	Eugene Zhulenev	2019-09-03
\|
*	Fixed Tensor documentation formatting.	Alberto Luaces	2019-07-23
\|
*	Fix shadow warnings in TensorContractionThreadPool	Eugene Zhulenev	2019-08-30
\|