path: root/unsupported/Eigen/CXX11/src/Tensor/TensorMorphing.h
Commit message (Author, Date)
* Fix calls to device functions from host code (Nathan Luehr, 2021-05-11)
|
* Don't crash when attempting to slice an empty tensor. (Rasmus Munk Larsen, 2021-02-24)
|
* Fix rule-of-3 for the Tensor module. (Antonio Sanchez, 2020-11-18)
|   Adds copy constructors to Tensor ops, inherits assignment operators from `TensorBase`. Addresses #1863
* Eigen moved the `ScanLauncher` function inside the internal namespace. (mehdi-goli, 2020-05-11)
|   This commit applies the following changes:
|   - Moving the `ScanLauncher` specialization inside the internal namespace to fix a compiler crash on TensorScan for the SYCL backend.
|   - Replacing `SYCL/sycl.hpp` with `CL/sycl.hpp` in order to follow the SYCL 1.2.1 standard.
|   - Minor fixes: commenting out an unused variable to avoid compiler warnings.
* Extend support for Packet16b: (Rasmus Munk Larsen, 2020-04-28)
|   - Add ptranspose<*,4> to support matmul and add unit test for Matrix<bool> * Matrix<bool>
|   - work around a bug in slicing of Tensor<bool>.
|   - Add tensor tests
|
|   This speeds up matmul for boolean matrices by about 10x
|
|   name                  old time/op   new time/op   delta
|   BM_MatMul<bool>/8     267ns ± 0%    479ns ± 0%    +79.25%  (p=0.008 n=5+5)
|   BM_MatMul<bool>/32    6.42µs ± 0%   0.87µs ± 0%   -86.50%  (p=0.008 n=5+5)
|   BM_MatMul<bool>/64    43.3µs ± 0%   5.9µs ± 0%    -86.42%  (p=0.008 n=5+5)
|   BM_MatMul<bool>/128   315µs ± 0%    44µs ± 0%     -85.98%  (p=0.008 n=5+5)
|   BM_MatMul<bool>/256   2.41ms ± 0%   0.34ms ± 0%   -85.68%  (p=0.008 n=5+5)
|   BM_MatMul<bool>/512   18.8ms ± 0%   2.7ms ± 0%    -85.53%  (p=0.008 n=5+5)
|   BM_MatMul<bool>/1k    149ms ± 0%    22ms ± 0%     -85.40%  (p=0.008 n=5+5)
* Add async evaluation support to TensorSlicingOp. (Eugene Zhulenev, 2020-04-22)
|   Device::memcpy is not async-safe and might lead to deadlocks. Always evaluate slice expression in async mode.
* Tensor block evaluation cost model (Eugene Zhulenev, 2019-12-18)
|
* Remove V2 suffix from TensorBlock (Eugene Zhulenev, 2019-12-10)
|
* Remove TensorBlock.h and old TensorBlock/BlockMapper (Eugene Zhulenev, 2019-12-10)
|
* Do not use std::vector in getResourceRequirements (Eugene Zhulenev, 2019-12-09)
|
* [SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch. (Mehdi Goli, 2019-11-28)
|   - Unifying all loadLocalTile from lhs and rhs to an extract_block function.
|   - Adding get_tensor operation which was missing in TensorContractionMapper.
|   - Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
|   - Wrapping all the indices in TensorScanSycl into Scan parameter struct.
|   - Fixing typo in Device SYCL
|   - Unifying load to private register for tall/skinny no shared
|   - Unifying load to vector tile for tensor-vector/vector-tensor operation
|   - Removing all the LHS/RHS class for extracting data from global
|   - Removing Outputfunction from TensorContractionSkinnyNoshared.
|   - Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
|   - Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
|   - Combining General Tensor-Vector and VectorTensor contraction into one kernel.
|   - Making double buffering optional for Tensor contraction when the local memory version is used.
|   - Modifying benchmark to accept custom Reduction Sizes
|   - Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
|   - Adding Test for SYCL
|   - Modifying SYCL CMake
* Remove legacy block evaluation support (Eugene Zhulenev, 2019-11-12)
|
* Add block evaluation V2 to TensorAsyncExecutor. (Rasmus Munk Larsen, 2019-10-22)
|   Add async evaluation to a number of ops.
* Propagate block evaluation preference through rvalue tensor expressions (Eugene Zhulenev, 2019-10-17)
|
* Block evaluation for TensorGenerator/TensorReverse/TensorShuffling (Eugene Zhulenev, 2019-10-14)
|
* Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing (Eugene Zhulenev, 2019-10-09)
* Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect (Eugene Zhulenev, 2019-10-02)
|
* Tensor block evaluation V2 support for unary/binary/broadcasting (Eugene Zhulenev, 2019-09-24)
|
* Fix expression evaluation heuristic for TensorSliceOp (Eugene Zhulenev, 2019-07-09)
|
* [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. (Mehdi Goli, 2019-06-28)
|   - Abstracting the pointer type so that both SYCL memory and pointer can be captured.
|   - Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class.
|   - Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node.
|   - Adding SYCL macro for controlling loop unrolling.
|   - Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
* Optimize evaluation strategy for TensorSlicingOp and TensorChippingOp (Eugene Zhulenev, 2019-06-25)
|
* Move struct outside of method for C++03 compatibility. (Christoph Hertzberg, 2018-10-02)
|
* Fix bug in copy optimization in Tensor slicing. (Eugene Zhulenev, 2018-09-28)
|
* Const cast scalar pointer in TensorSlicingOp evaluator (Eugene Zhulenev, 2018-09-14)
|
* Fix compilation of tiled evaluation code with c++03 (Eugene Zhulenev, 2018-09-11)
|
* Merge with upstream eigen/default (Eugene Zhulenev, 2018-08-27)
|\
| * Fixed more sign-compare and type-limits warnings (Christoph Hertzberg, 2018-08-24)
| |
* | Merge with eigen/default (Eugene Zhulenev, 2018-08-10)
|\|
* | Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators (Eugene Zhulenev, 2018-08-10)
* | Fix bug in a test + compilation errors (Eugene Zhulenev, 2018-08-09)
| |
* | Replace all using declarations with typedefs in Tensor ops (Eugene Zhulenev, 2018-08-01)
| |
* | Fix typo + get rid of redundant member variables for block sizes (Eugene Zhulenev, 2018-08-01)
| |
* | Merged latest changes from upstream/eigen (Eugene Zhulenev, 2018-08-01)
|\|
| * Enabling per device specialisation of packetsize. (Mehdi Goli, 2018-08-01)
| |
* | Add block evaluation support to TensorOps (Eugene Zhulenev, 2018-07-31)
|/
* Add tiled evaluation support to TensorExecutor (Eugene Zhulenev, 2018-07-25)
|
* Updates corresponding to the latest round of PR feedback (Deven Desai, 2018-07-11)
|   The major changes are
|   1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
|   2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h
|   3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
|      The above three changes effectively enable the Eigen "Packet" layer for the HIP platform
|   4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
|   5. Updating the "EIGEN_DEVICE_FUNC" marking in some places
|
|   The change has been tested on the HIP and CUDA platforms.
* Adding support for using Eigen in HIP kernels. (Deven Desai, 2018-06-06)
|   This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
|
|   Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).
|
|   Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
* Enable RawAccess to tensor slices whenever possible. (Benoit Steiner, 2018-04-30)
|   Avoid 32-bit integer overflow in TensorSlicingOp
* Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10) (Benoit Steiner, 2017-06-28)
|   DataDependancy
|   - Wrapping data type to the pointer class for sycl in non-terminal nodes; not having that breaks Tensorflow Conv2d code.
|   - Applying Ronnan's Comments.
|   - Applying benoit's comments
* Adding non-dereferenceable pointer track for ComputeCpp backend; Adding TensorConvolutionOp for ComputeCpp; fixing typos. modifying TensorDeviceSycl to use the LegacyPointer class. (Mehdi Goli, 2017-01-19)
* Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying Tensor Contractsycl to be located in any place in the expression tree. (Mehdi Goli, 2017-01-16)
* Adding sycl backend for TensorPadding.h; disabling __uint128 for sycl in TensorIntDiv.h; disabling cashsize for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code. (Mehdi Goli, 2016-12-01)
* Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl. (Mehdi Goli, 2016-11-14)
|
* Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing. (Mehdi Goli, 2016-11-14)
|
* Added missing EIGEN_DEVICE_FUNC (Benoit Steiner, 2016-06-07)
|
* Fixed compilation warning (Benoit Steiner, 2016-06-01)
|
* Reimplement clamp as a static function. (Benoit Steiner, 2016-05-27)
|
* Use NULL instead of nullptr to preserve the compatibility with cxx03 (Benoit Steiner, 2016-05-27)
|
* Added a new operation to enable more powerful tensor indexing. (Benoit Steiner, 2016-05-27)
|