Commit message | Author | Age | |
---|---|---|---|
* | Merge with eigen/default | 2018-08-10 | |
|\ | |||
* | | Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators | 2018-08-10 | |
| | | |||
| * | Avoided language features that are only available in cxx11 mode. | 2018-08-10 | |
| | | |||
| * | Made the code compile with gcc 5.4. | 2018-08-10 | |
| | | |||
* | | Fix bug in a test + compilation errors | 2018-08-09 | |
| | | |||
* | | Merged with upstream eigen | 2018-08-08 | |
|\| | |||
| * | Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462) | 2018-08-08 | |
| |\ | Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation. | ||
| | * | Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation. | 2018-08-08 | |
| | | | |||
| * | | Fix init order. | 2018-08-07 | |
| |/ | |||
| * | Silenced a couple of compilation warnings. | 2018-08-06 | |
| | | |||
| * | Fixed compilation errors. | 2018-08-06 | |
| | | |||
| * | Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation. | 2018-08-06 | |
| | | |||
* | | Replace all using declarations with typedefs in Tensor ops | 2018-08-01 | |
| | | |||
| * | Fix initialization order. | 2018-08-03 | |
| | | |||
| * | Fixing the compilation error. | 2018-08-03 | |
| | | |||
| * | Creating separate SYCL required PR for uncontroversial files. | 2018-08-03 | |
| | | |||
* | | Fix typo + get rid of redundant member variables for block sizes | 2018-08-01 | |
| | | |||
| * | Merged in paultucker/eigen (pull request PR-431) | 2018-08-01 | |
| |\ | Optional ThreadPoolDevice allocator. Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com> | ||
* | | | Merged latest changes from upstream/eigen | 2018-08-01 | |
|\| | | |||
| * | | Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447) | 2018-08-01 | |
| |\ \ | Adding variadic version of assert which can take a parameter pack as its input. | ||
| * \ \ | Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446) | 2018-08-01 | |
| |\ \ \ | Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation. | ||
| | * | | | Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h | 2018-08-01 | |
| | | | | | |||
| * | | | Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449) | 2018-08-01 | |
| |\ \ \ \ | Enabling per device specialisation of packetSize. | ||
| | | | * | | Using the suggested modification. | 2018-08-01 | |
| | | | | | | |||
| | * | | | | Enabling per device specialisation of packetsize. | 2018-08-01 | |
| | | | | | | |||
| | | | * | | variadic version of assert which can take a parameter pack as its input. | 2018-08-01 | |
| | | |/ / | | |/| | | |||
| | | * | | Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation. | 2018-08-01 | |
| | |/ / | |||
| * / / | Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. | 2018-08-01 | |
| |/ / | |||
* | | | Add block evaluation support to TensorOps | 2018-07-31 | |
| | | | |||
| * | | Merged in yuefengz/eigen (pull request PR-370) | 2018-07-31 | |
| |\ \ | Use device's allocate function instead of internal::aligned_malloc. | ||
| | | * | Change getAllocator() to allocator() in ThreadPoolDevice. | 2018-07-31 | |
| | | | | |||
| * | | | Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) | 2018-07-31 | |
| |\ \ \ | Tiled tensor executor | ||
| |/ / / | |||
|/| | | | |||
| * | | | Speed up trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437. | 2018-07-31 | |
| | | | | |||
* | | | | Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible | 2018-07-27 | |
| | | | | |||
* | | | | Add tiled evaluation support to TensorExecutor | 2018-07-25 | |
| | | | | |||
| * | | | Reduce the number of template specializations of classes related to tensor contraction to reduce binary size. | 2018-07-27 | |
| | | | | |||
* | | | | TensorBlockIO | 2018-07-23 | |
|/ / / | |||
* | | | Initial support of TensorBlock | 2018-07-20 | |
| | | | |||
| | * | Add test coverage for ThreadPoolDevice optional allocator. | 2018-07-19 | |
| | | | |||
* | | | PR430: Convert count to the reducer type in MeanReducer | 2018-07-19 | |
| | | | Without explicit conversion, TensorFlow fails to compile because pset1 template deduction fails: cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' in: return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I'm not sure why it works in Eigen tests, because the Eigen::half constructor is explicit, or why it stopped working in TF; I didn't find any relevant changes since the previous Eigen upgrade. static_cast<T>(packetCount_) breaks the cxx11_tensor_reductions test for Eigen::half, which is also quite surprising. | ||
| | * | Actually add optional Allocator* arg to ThreadPoolDevice(). | 2018-07-16 | |
| | | | |||
| | * | Add optional Allocator argument to ThreadPoolDevice constructor. | 2018-07-16 | |
| | | | When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted to a single NUMA-node. | ||
* | | | Reduce number of allocations in TensorContractionThreadPool. | 2018-07-16 | |
| | | | |||
* | | | bug #1569: fix Tensor<half>::mean() on AVX with respective unit test. | 2018-07-19 | |
| | | | |||
* | | | Assert that no output kernel is defined for GPU contraction | 2018-07-18 | |
| | | | |||
* | | | Specify default output kernel for TensorContractionOp | 2018-07-18 | |
| | | | |||
| * | | Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances. | 2018-02-20 | |
| | | | |||
* | | | Added a move constructor and move assignment operator to Tensor and wrote some tests. | 2018-02-07 | |
| | | | |||
* | | | Fix TensorContractionOp evaluators for GPU and SYCL | 2018-07-17 | |
| | | | |||
* | | | applying EIGEN_DECLARE_TEST to *gpu* tests | 2018-07-17 | |
| | | | Also, a few minor fixes for GPU tests running in HIP mode. 1. Adding an include for hip/hip_runtime.h in the Macros.h file. For HIP, __host__ and __device__ are macros which are defined in hip headers. Their definitions need to be included before their use in the file. 2. Fixing the compile failure in TensorContractionGpu introduced by the commit to "Fuse computations into the Tensor contractions using output kernel". 3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit. | ||