eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Fail at compile time if default executor tries to use non-default device	Eugene Zhulenev	2020-02-06
\|
*	Tensor block evaluation cost model	Eugene Zhulenev	2019-12-18
\|
*	Reduce block evaluation overhead for small tensor expressions	Eugene Zhulenev	2019-12-17
\|
*	Add back accidentally deleted default constructor to TensorExecutorTilingContext.	Eugene Zhulenev	2019-12-11
\|
*	Remove block memory allocation required by removed block evaluation API	Eugene Zhulenev	2019-12-10
\|
*	Remove V2 suffix from TensorBlock	Eugene Zhulenev	2019-12-10
\|
*	Remove TensorBlock.h and old TensorBlock/BlockMapper	Eugene Zhulenev	2019-12-10
\|
*	Do not use std::vector in getResourceRequirements	Eugene Zhulenev	2019-12-09
\|
*	Use EIGEN_DEVICE_FUNC macro instead of __device__.	Rasmus Munk Larsen	2019-12-03
\|
*	[SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch.	Mehdi Goli	2019-11-28
\|	* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
\|	* Adding get_tensor operation which was missing in TensorContractionMapper.
\|	* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
\|	* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
\|	* Fixing typo in Device SYCL.
\|	* Unifying load to private register for tall/skinny no shared.
\|	* Unifying load to vector tile for tensor-vector/vector-tensor operation.
\|	* Removing all the LHS/RHS class for extracting data from global.
\|	* Removing Outputfunction from TensorContractionSkinnyNoshared.
\|	* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
\|	* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
\|	* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
\|	* Making double buffering optional for Tensor contraction when the local memory version is used.
\|	* Modifying benchmark to accept custom Reduction Sizes.
\|	* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host.
\|	* Adding Test for SYCL.
\|	* Modifying SYCL CMake.
*	Remove legacy block evaluation support	Eugene Zhulenev	2019-11-12
\|
*	Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs.	Rasmus Munk Larsen	2019-11-11
\|
*	Prevent potential ODR in TensorExecutor	Eugene Zhulenev	2019-10-28
\|
*	Add block evaluation V2 to TensorAsyncExecutor.	Rasmus Munk Larsen	2019-10-22
\|	Add async evaluation to a number of ops.
*	Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers.	Rasmus Munk Larsen	2019-10-18
\|
*	Block evaluation for TensorGenerator/TensorReverse/TensorShuffling	Eugene Zhulenev	2019-10-14
\|
*	Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing	Eugene Zhulenev	2019-10-09
\|
*	Add block evaluation to TensorEvalTo and fix few small bugs	Eugene Zhulenev	2019-10-07
\|
*	Tensor block evaluation V2 support for unary/binary/broadcasting	Eugene Zhulenev	2019-09-24
\|
*	Allow move-only done callback in TensorAsyncDevice	Eugene Zhulenev	2019-09-03
\|
*	Fix block mapper type name in TensorExecutor	Eugene Zhulenev	2019-08-30
\|
*	evalSubExprsIfNeededAsync + async TensorContractionThreadPool	Eugene Zhulenev	2019-08-30
\|
*	Asynchronous expression evaluation with TensorAsyncDevice	Eugene Zhulenev	2019-08-30
\|
*	Allocate non-const scalar buffer for block evaluation with DefaultDevice	Eugene Zhulenev	2019-07-01
\|
*	Merge with Eigen head	Eugene Zhulenev	2019-06-28
\|\
* \|	Add block access to TensorReverseOp and make sure that TensorForcedEval uses block access when preferred	Eugene Zhulenev	2019-06-28
\| \|
\| *	[SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL.	Mehdi Goli	2019-06-28
\|/	* Abstracting the pointer type so that both SYCL memory and pointer can be captured.
\|	* Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class.
\|	* Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node.
\|	* Adding SYCL macro for controlling loop unrolling.
\|	* Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
*	Prevent potential division by zero in TensorExecutor	Eugene Zhulenev	2019-05-17
\|
*	Always evaluate Tensor expressions with broadcasting via tiled evaluation code path	Eugene Zhulenev	2019-05-16
\|
*	Fix segfaults with cuda compilation	Eugene Zhulenev	2019-03-11
\|
*	Fix placement of "#if defined(EIGEN_GPUCC)" guard region.	Rasmus Munk Larsen	2019-03-06
\|	Found with -Wundefined-func-template. Author: tkoeppe@google.com
*	Fix shadowing of last and all	Gael Guennebaud	2018-09-21
\|
*	Explicitly construct tensor block dimensions from evaluator dimensions	Eugene Zhulenev	2018-09-14
\|
*	Merge with upstream eigen/default	Eugene Zhulenev	2018-08-27
\|\
\| *	Fix several integer conversion and sign-compare warnings	Christoph Hertzberg	2018-08-24
\| \|
\| *	Removed an unused variable (PacketSize) from TensorExecutor	Sameer Agarwal	2018-08-15
\| \|
\| *	Fixed more compilation errors	Benoit Steiner	2018-08-15
\| \|
* \|	Merge with eigen/default	Eugene Zhulenev	2018-08-10
\|\ \
\| \| *	Code cleanup	Benoit Steiner	2018-08-13
\| \| \|
\| \| *	Use NULL instead of nullptr to avoid adding a cxx11 requirement.	Benoit Steiner	2018-08-13
\| \|/
\| *	Avoided language features that are only available in cxx11 mode.	Benoit Steiner	2018-08-10
\| \|
* \|	Fix bug in a test + compilation errors	Eugene Zhulenev	2018-08-09
\| \|
* \|	Replace all using declarations with typedefs in Tensor ops	Eugene Zhulenev	2018-08-01
\|/
*	Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO.	Mehdi Goli	2018-08-01
\|
*	Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible	Eugene Zhulenev	2018-07-27
\|
*	Add tiled evaluation support to TensorExecutor	Eugene Zhulenev	2018-07-25
\|
*	Remove SimpleThreadPool and always use {NonBlocking}ThreadPool	Eugene Zhulenev	2018-07-16
\|
*	merging the CUDA and HIP implementation for the Tensor directory and the unit tests	Deven Desai	2018-06-20
\|
*	updates based on PR feedback	Deven Desai	2018-06-14
\|	There are two major changes (and a few minor ones which are not listed here; see the PR discussion for details):
\|	1. Eigen::half implementations for HIP and CUDA have been merged. This means that
\|	   - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
\|	   - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
\|	   - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`
\|	   After this change the `HIP/hcc` directory only contains one file, `math_constants.h`. That will go away too once that file becomes a part of the HIP install.
\|	2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
\|	   - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC \|\| EIGEN_HIPCC)`
\|	   - `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH \|\| EIGEN_HIP_DEVICE_COMPILE)`
\|	   - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`
*	Adding support for using Eigen in HIP kernels.	Deven Desai	2018-06-06
\|	This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
\|	Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).
\|	Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.