eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Don't crash when attempting to slice an empty tensor.	Rasmus Munk Larsen	2021-02-24
\|
*	Fix rule-of-3 for the Tensor module.	Antonio Sanchez	2020-11-18
\|	Adds copy constructors to Tensor ops, inherits assignment operators from `TensorBase`. Addresses #1863.
*	Eigen moved the `ScanLauncher` function inside the internal namespace.	mehdi-goli	2020-05-11
\|	This commit applies the following changes:
\|	- Moving the `ScanLauncher` specialization inside the internal namespace to fix a compiler crash on TensorScan for the SYCL backend.
\|	- Replacing `SYCL/sycl.hpp` with `CL/sycl.hpp` in order to follow the SYCL 1.2.1 standard.
\|	- Minor fixes: commenting out an unused variable to avoid compiler warnings.
*	Extend support for Packet16b:	Rasmus Munk Larsen	2020-04-28
\|	* Add ptranspose<*,4> to support matmul and add unit test for Matrix<bool> * Matrix<bool>
\|	* Work around a bug in slicing of Tensor<bool>.
\|	* Add tensor tests
\|	This speeds up matmul for boolean matrices by about 10x
\|	name                  old time/op  new time/op  delta
\|	BM_MatMul<bool>/8      267ns ± 0%   479ns ± 0%  +79.25%  (p=0.008 n=5+5)
\|	BM_MatMul<bool>/32    6.42µs ± 0%  0.87µs ± 0%  -86.50%  (p=0.008 n=5+5)
\|	BM_MatMul<bool>/64    43.3µs ± 0%   5.9µs ± 0%  -86.42%  (p=0.008 n=5+5)
\|	BM_MatMul<bool>/128    315µs ± 0%    44µs ± 0%  -85.98%  (p=0.008 n=5+5)
\|	BM_MatMul<bool>/256   2.41ms ± 0%  0.34ms ± 0%  -85.68%  (p=0.008 n=5+5)
\|	BM_MatMul<bool>/512   18.8ms ± 0%   2.7ms ± 0%  -85.53%  (p=0.008 n=5+5)
\|	BM_MatMul<bool>/1k     149ms ± 0%    22ms ± 0%  -85.40%  (p=0.008 n=5+5)
*	Add async evaluation support to TensorSlicingOp.	Eugene Zhulenev	2020-04-22
\|	Device::memcpy is not async-safe and might lead to deadlocks. Always evaluate slice expression in async mode.
*	Tensor block evaluation cost model	Eugene Zhulenev	2019-12-18
\|
*	Remove V2 suffix from TensorBlock	Eugene Zhulenev	2019-12-10
\|
*	Remove TensorBlock.h and old TensorBlock/BlockMapper	Eugene Zhulenev	2019-12-10
\|
*	Do not use std::vector in getResourceRequirements	Eugene Zhulenev	2019-12-09
\|
*	[SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch.	Mehdi Goli	2019-11-28
\|	* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
\|	* Adding get_tensor operation which was missing in TensorContractionMapper.
\|	* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
\|	* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
\|	* Fixing typo in Device SYCL
\|	* Unifying load to private register for tall/skinny no shared
\|	* Unifying load to vector tile for tensor-vector/vector-tensor operation
\|	* Removing all the LHS/RHS class for extracting data from global
\|	* Removing Outputfunction from TensorContractionSkinnyNoshared.
\|	* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
\|	* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
\|	* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
\|	* Making double buffering optional for Tensor contraction when the local memory version is used.
\|	* Modifying benchmark to accept custom Reduction Sizes
\|	* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
\|	* Adding Test for SYCL
\|	* Modifying SYCL CMake
*	Remove legacy block evaluation support	Eugene Zhulenev	2019-11-12
\|
*	Add block evaluation V2 to TensorAsyncExecutor.	Rasmus Munk Larsen	2019-10-22
\|	Add async evaluation to a number of ops.
*	Propagate block evaluation preference through rvalue tensor expressions	Eugene Zhulenev	2019-10-17
\|
*	Block evaluation for TensorGenerator/TensorReverse/TensorShuffling	Eugene Zhulenev	2019-10-14
\|
*	Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing	Eugene Zhulenev	2019-10-09
\|
*	Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect	Eugene Zhulenev	2019-10-02
\|
*	Tensor block evaluation V2 support for unary/binary/broadcasting	Eugene Zhulenev	2019-09-24
\|
*	Fix expression evaluation heuristic for TensorSliceOp	Eugene Zhulenev	2019-07-09
\|
*	[SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL.	Mehdi Goli	2019-06-28
\|	* Abstracting the pointer type so that both SYCL memory and pointer can be captured.
\|	* Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class.
\|	* Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node.
\|	* Adding SYCL macro for controlling loop unrolling.
\|	* Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
*	Optimize evaluation strategy for TensorSlicingOp and TensorChippingOp	Eugene Zhulenev	2019-06-25
\|
*	Move struct outside of method for C++03 compatibility.	Christoph Hertzberg	2018-10-02
\|
*	Fix bug in copy optimization in Tensor slicing.	Eugene Zhulenev	2018-09-28
\|
*	Const cast scalar pointer in TensorSlicingOp evaluator	Eugene Zhulenev	2018-09-14
\|
*	Fix compilation of tiled evaluation code with c++03	Eugene Zhulenev	2018-09-11
\|
*	Merge with upstream eigen/default	Eugene Zhulenev	2018-08-27
\|\
\| *	Fixed more sign-compare and type-limits warnings	Christoph Hertzberg	2018-08-24
\| \|
* \|	Merge with eigen/default	Eugene Zhulenev	2018-08-10
\|\\|
* \|	Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators	Eugene Zhulenev	2018-08-10
\| \|
* \|	Fix bug in a test + compilation errors	Eugene Zhulenev	2018-08-09
\| \|
* \|	Replace all using declarations with typedefs in Tensor ops	Eugene Zhulenev	2018-08-01
\| \|
* \|	Fix typo + get rid of redundant member variables for block sizes	Eugene Zhulenev	2018-08-01
\| \|
* \|	Merged latest changes from upstream/eigen	Eugene Zhulenev	2018-08-01
\|\\|
\| *	Enabling per device specialisation of packetsize.	Mehdi Goli	2018-08-01
\| \|
* \|	Add block evaluation support to TensorOps	Eugene Zhulenev	2018-07-31
\|/
*	Add tiled evaluation support to TensorExecutor	Eugene Zhulenev	2018-07-25
\|
*	Updates corresponding to the latest round of PR feedback	Deven Desai	2018-07-11
\|	The major changes are
\|	1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
\|	2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
\|	3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
\|	The above three changes effectively enable the Eigen "Packet" layer for the HIP platform
\|	4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
\|	5. Updating the "EIGEN_DEVICE_FUNC" marking in some places
\|	The change has been tested on the HIP and CUDA platforms.
*	Adding support for using Eigen in HIP kernels.	Deven Desai	2018-06-06
\|	This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
\|	Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).
\|	Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
*	Enable RawAccess to tensor slices whenever possible.	Benoit Steiner	2018-04-30
\|	Avoid 32-bit integer overflow in TensorSlicingOp
*	Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10)	Benoit Steiner	2017-06-28
\|	DataDependancy
\|	* Wrapping data type to the pointer class for sycl in non-terminal nodes; not having that breaks Tensorflow Conv2d code.
\|	* Applying Ronnan's comments.
\|	* Applying Benoit's comments.
*	Adding non-dereferenceable pointer track for ComputeCpp backend; adding TensorConvolutionOp for ComputeCpp; fixing typos; modifying TensorDeviceSycl to use the LegacyPointer class.	Mehdi Goli	2017-01-19
\|
*	Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying Tensor Contractsycl to be located in any place in the expression tree.	Mehdi Goli	2017-01-16
\|
*	Adding sycl backend for TensorPadding.h; disabling __uint128 for sycl in TensorIntDiv.h; disabling cache size for sycl in TensorDeviceDefault.h; adding sycl backend for StrideSliceOP; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code.	Mehdi Goli	2016-12-01
\|
*	Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl.	Mehdi Goli	2016-11-14
\|
*	Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing.	Mehdi Goli	2016-11-14
\|
*	Added missing EIGEN_DEVICE_FUNC	Benoit Steiner	2016-06-07
\|
*	Fixed compilation warning	Benoit Steiner	2016-06-01
\|
*	Reimplement clamp as a static function.	Benoit Steiner	2016-05-27
\|
*	Use NULL instead of nullptr to preserve the compatibility with cxx03	Benoit Steiner	2016-05-27
\|
*	Added a new operation to enable more powerful tensor indexing.	Benoit Steiner	2016-05-27
\|
*	Fixed compilation errors triggered by old versions of gcc	Benoit Steiner	2016-05-12
\|