Commit message | Author | Age | |
---|---|---|---|
* | updates based on PR feedback | 2018-06-14 | |
| | | | | There are two major changes (and a few minor ones that are not listed here; see the PR discussion for details). 1. The Eigen::half implementations for HIP and CUDA have been merged. This means that: `CUDA/Half.h` and `HIP/hcc/Half.h` were merged into a new file `GPU/Half.h`; `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` were merged into a new file `GPU/PacketMathHalf.h`; `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` were merged into a new file `GPU/TypeCasting.h`. After this change the `HIP/hcc` directory contains only one file, `math_constants.h`. That will go away too once that file becomes part of the HIP install. 2. New macros `EIGEN_GPUCC`, `EIGEN_GPU_COMPILE_PHASE` and `EIGEN_HAS_GPU_FP16` have been added, and the code has been updated to use them where appropriate: `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`; `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`; `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`. | ||
* | Fix typos found using codespell | 2018-06-07 | |
| | |||
* | Hyperlink DOIs against preferred resolver | 2018-05-24 | |
| | |||
* | Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH ↵ | 2017-07-17 | |
| | | | | aliases | ||
* | Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying ↵ | 2017-01-16 | |
| | | | | Tensor Contractsycl to be located in any place in the expression tree. | ||
* | Adding sycl backend for TensorPadding.h; disabling __uint128 for sycl in ↵ | 2016-12-01 | |
| | | | | TensorIntDiv.h; disabling cache size for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code. | ||
* | Fixing LLVM error on TensorMorphingSycl.h on GPU; fixing int64_t crash for ↵ | 2016-11-25 | |
| | | | | tensor_broadcast_sycl on GPU; adding get_sycl_supported_devices() on syclDevice.h. | ||
* | Emulate _BitScanReverse64 for 32-bit builds | 2016-07-11 | |
| | |||
* | Change runtime to compile-time conditional. | 2016-07-08 | |
| | |||
* | Fixed the integer division code on windows | 2016-03-09 | |
| | |||
* | Fixed the computation of leading zeros when compiling with msvc. | 2016-03-04 | |
| | |||
* | Fixed a typo | 2016-03-04 | |
| | |||
* | Updated the TensorIntDivisor code to work properly on LLP64 systems | 2016-02-08 | |
| | |||
* | Fixed the implementation of Eigen::internal::count_leading_zeros for MSVC. | 2015-11-23 | |
| | | | | Also updated the code to silence bogus warnings generated by nvcc when compiling this function. | ||
* | Added proper support for fast 64bit integer division on CUDA | 2015-11-20 | |
| | |||
* | Avoid using the version of TensorIntDiv optimized for 32-bit integers when ↵ | 2015-11-18 | |
| | | | | the divisor can be equal to one since it isn't supported. | ||
* | Fix some trivial warnings | 2015-08-19 | |
| | |||
* | Fixed 2 compilation warnings generated by llvm | 2015-07-29 | |
| | |||
* | Fixed a few compilation warnings triggered by clang | 2015-07-29 | |
| | |||
* | Simplified and generalized the DividerTraits code | 2015-07-29 | |
| | |||
* | Add missing specialization of struct DividerTraits<long> | 2015-07-29 | |
| | |||
* | Extended the range of value inputs for TensorIntDiv to support tensors with ↵ | 2015-07-22 | |
| | | | | more than 4 billion elements. | ||
* | Fixed a bug in the integer division code that caused some large numerators ↵ | 2015-07-13 | |
| | | | | to be incorrectly handled | ||
* | Improved and cleaned up the 2d patch extraction code | 2015-07-07 | |
| | |||
* | Fix undefined behavior. | 2015-06-19 | |
| | |||
* | Added new version of the TensorIntDiv class optimized for 32-bit signed ↵ | 2015-05-19 | |
| | | | | integers. It saves 1 register on CPU and 2 on GPU. | ||
* | Fixed incorrect assertion | 2015-02-28 | |
| | |||
* | Fixed another batch of compilation warnings | 2015-02-28 | |
| | |||
* | Fixed another compilation problem with TensorIntDiv.h | 2015-02-26 | |
| | |||
* | Made TensorIntDiv.h compile with MSVC | 2015-02-25 | |
| | |||
* | Fixed another clang warning | 2015-02-25 | |
| | |||
* | Misc improvements and cleanups | 2014-10-13 | |
| | |||
* | Added support for fast integer divisions by a constant | 2014-08-14 | |
| | | | | Sped up tensor slicing by a factor of 3 by using these fast integer divisions. | ||
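
The earliest entry in this log introduces fast integer division by a constant. One well-known way to do this replaces the hardware divide with a precomputed multiply and two shifts (the scheme from Granlund and Montgomery, "Division by invariant integers using multiplication"). The sketch below shows that idea for unsigned 32-bit values only. The name `FastDivisor32` and its members are hypothetical and chosen for illustration; this is not the code in TensorIntDiv.h, and the log does not say which scheme that file uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type, not Eigen's TensorIntDivisor.
// Precomputes a multiplier and two shift amounts for a fixed divisor d,
// so that later divisions by d need only a multiply, an add and shifts.
struct FastDivisor32 {
  std::uint64_t multiplier;  // "magic" constant, always below 2^32
  std::uint32_t shift1;      // min(l, 1)
  std::uint32_t shift2;      // max(l - 1, 0)

  explicit FastDivisor32(std::uint32_t d) {
    assert(d != 0 && "division by zero is not supported");
    // l = ceil(log2(d)): the smallest l with 2^l >= d.
    std::uint32_t l = 0;
    while ((std::uint64_t{1} << l) < d) ++l;
    // multiplier = floor(2^32 * (2^l - d) / d) + 1.
    // The product stays below 2^63, so 64-bit arithmetic is enough.
    multiplier =
        ((std::uint64_t{1} << 32) * ((std::uint64_t{1} << l) - d)) / d + 1;
    shift1 = l < 1 ? l : 1;
    shift2 = l > 0 ? l - 1 : 0;
  }

  // Returns n / d (truncating) without using a divide instruction.
  std::uint32_t divide(std::uint32_t n) const {
    // High 32 bits of the 64-bit product multiplier * n.
    const std::uint32_t t1 =
        static_cast<std::uint32_t>((multiplier * n) >> 32);
    // t1 <= n, so neither the subtraction nor the sum can wrap around.
    return (t1 + ((n - t1) >> shift1)) >> shift2;
  }
};
```

In a slicing-style workload the divisor (for example a stride) is fixed for the lifetime of the expression while the numerator (a linear index) changes on every element, so the one-time cost of constructing the divisor is amortized over many calls to `divide`.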