Commit message | Author | Age | |
---|---|---|---|
* | Fail at compile time if default executor tries to use non-default device | Eugene Zhulenev | 2020-02-06 |
| | |||
* | Tensor block evaluation cost model | Eugene Zhulenev | 2019-12-18 |
| | |||
* | Reduce block evaluation overhead for small tensor expressions | Eugene Zhulenev | 2019-12-17 |
| | |||
* | Add back accidentally deleted default constructor to TensorExecutorTilingContext. | Eugene Zhulenev | 2019-12-11 |
| | |||
* | Remove block memory allocation required by removed block evaluation API | Eugene Zhulenev | 2019-12-10 |
| | |||
* | Remove V2 suffix from TensorBlock | Eugene Zhulenev | 2019-12-10 |
| | |||
* | Remove TensorBlock.h and old TensorBlock/BlockMapper | Eugene Zhulenev | 2019-12-10 |
| | |||
* | Do not use std::vector in getResourceRequirements | Eugene Zhulenev | 2019-12-09 |
| | |||
* | Use EIGEN_DEVICE_FUNC macro instead of __device__. | Rasmus Munk Larsen | 2019-12-03 |
| | |||
* | [SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch. | Mehdi Goli | 2019-11-28 |
| | | | | | | | | | | | | | | | | | | | | | | * Unifying all loadLocalTile from lhs and rhs to an extract_block function. * Adding get_tensor operation which was missing in TensorContractionMapper. * Adding the -D method missing from cmake for Disable_Skinny Contraction operation. * Wrapping all the indices in TensorScanSycl into Scan parameter struct. * Fixing typo in Device SYCL * Unifying load to private register for tall/skinny no shared * Unifying load to vector tile for tensor-vector/vector-tensor operation * Removing all the LHS/RHS class for extracting data from global * Removing Outputfunction from TensorContractionSkinnyNoshared. * Combining the local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining General Tensor-Vector and VectorTensor contraction into one kernel. * Making double buffering optional for Tensor contraction when the local memory version is used. * Modifying benchmark to accept custom Reduction Sizes * Disabling AVX optimization for SYCL backend on the host to allow SSE optimization on the host * Adding Test for SYCL * Modifying SYCL CMake | ||
* | Remove legacy block evaluation support | Eugene Zhulenev | 2019-11-12 |
| | |||
* | Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs. | Rasmus Munk Larsen | 2019-11-11 |
| | |||
* | Prevent potential ODR in TensorExecutor | Eugene Zhulenev | 2019-10-28 |
| | |||
* | Add block evaluation V2 to TensorAsyncExecutor. | Rasmus Munk Larsen | 2019-10-22 |
| | | | | Add async evaluation to a number of ops. | ||
* | Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers. | Rasmus Munk Larsen | 2019-10-18 |
| | |||
* | Block evaluation for TensorGenerator/TensorReverse/TensorShuffling | Eugene Zhulenev | 2019-10-14 |
| | |||
* | Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing | Eugene Zhulenev | 2019-10-09 |
| | |||
* | Add block evaluation to TensorEvalTo and fix a few small bugs | Eugene Zhulenev | 2019-10-07 |
| | |||
* | Tensor block evaluation V2 support for unary/binary/broadcasting | Eugene Zhulenev | 2019-09-24 |
| | |||
* | Allow move-only done callback in TensorAsyncDevice | Eugene Zhulenev | 2019-09-03 |
| | |||
* | Fix block mapper type name in TensorExecutor | Eugene Zhulenev | 2019-08-30 |
| | |||
* | evalSubExprsIfNeededAsync + async TensorContractionThreadPool | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Asynchronous expression evaluation with TensorAsyncDevice | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Allocate non-const scalar buffer for block evaluation with DefaultDevice | Eugene Zhulenev | 2019-07-01 |
| | |||
* | Merge with Eigen head | Eugene Zhulenev | 2019-06-28 |
|\ | |||
* | | Add block access to TensorReverseOp and make sure that TensorForcedEval uses block access when preferred | Eugene Zhulenev | 2019-06-28 |
| | | |||
| * | [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. | Mehdi Goli | 2019-06-28 |
|/ | | | | | | | | | | * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes. | ||
* | Prevent potential division by zero in TensorExecutor | Eugene Zhulenev | 2019-05-17 |
| | |||
* | Always evaluate Tensor expressions with broadcasting via tiled evaluation code path | Eugene Zhulenev | 2019-05-16 |
| | |||
* | Fix segfaults with cuda compilation | Eugene Zhulenev | 2019-03-11 |
| | |||
* | Fix placement of "#if defined(EIGEN_GPUCC)" guard region. | Rasmus Munk Larsen | 2019-03-06 |
| | | | | | | Found with -Wundefined-func-template. Author: tkoeppe@google.com | ||
* | Fix shadowing of last and all | Gael Guennebaud | 2018-09-21 |
| | |||
* | Explicitly construct tensor block dimensions from evaluator dimensions | Eugene Zhulenev | 2018-09-14 |
| | |||
* | Merge with upstream eigen/default | Eugene Zhulenev | 2018-08-27 |
|\ | |||
| * | Fix several integer conversion and sign-compare warnings | Christoph Hertzberg | 2018-08-24 |
| | | |||
| * | Removed an unused variable (PacketSize) from TensorExecutor | Sameer Agarwal | 2018-08-15 |
| | | |||
| * | Fixed more compilation errors | Benoit Steiner | 2018-08-15 |
| | | |||
* | | Merge with eigen/default | Eugene Zhulenev | 2018-08-10 |
|\ \ | |||
| | * | Code cleanup | Benoit Steiner | 2018-08-13 |
| | | | |||
| | * | Use NULL instead of nullptr to avoid adding a cxx11 requirement. | Benoit Steiner | 2018-08-13 |
| |/ | |||
| * | Avoided language features that are only available in cxx11 mode. | Benoit Steiner | 2018-08-10 |
| | | |||
* | | Fix bug in a test + compilation errors | Eugene Zhulenev | 2018-08-09 |
| | | |||
* | | Replace all using declarations with typedefs in Tensor ops | Eugene Zhulenev | 2018-08-01 |
|/ | |||
* | Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. | Mehdi Goli | 2018-08-01 |
| | |||
* | Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible | Eugene Zhulenev | 2018-07-27 |
| | |||
* | Add tiled evaluation support to TensorExecutor | Eugene Zhulenev | 2018-07-25 |
| | |||
* | Remove SimpleThreadPool and always use {NonBlocking}ThreadPool | Eugene Zhulenev | 2018-07-16 |
| | |||
* | merging the CUDA and HIP implementation for the Tensor directory and the unit tests | Deven Desai | 2018-06-20 |
| | |||
* | updates based on PR feedback | Deven Desai | 2018-06-14 |
| | | | | | | | | | | | | | | | | | There are two major changes (and a few minor ones which are not listed here; see PR discussion for details). 1. Eigen::half implementations for HIP and CUDA have been merged. This means that - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h` - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h` - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h` After this change the `HIP/hcc` directory only contains one file, `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate. - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)` - `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)` - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)` | ||
* | Adding support for using Eigen in HIP kernels. | Deven Desai | 2018-06-06 |
| | | | | | | | | | This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs. Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor). Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests. |