Commit message | Author | Age | |
---|---|---|---|
* | Add support for thread local support on platforms that do not support it through emulation using a hash map. | 2018-08-13 | |
| | |||
* | Merged in rmlarsen/eigen2 (pull request PR-466) | 2018-08-13 | |
|\ | Move sigmoid functor to core and rename it to 'logistic'. | ||
| * | Call logistic functor from Tensor::sigmoid. | 2018-08-13 | |
| | | |||
* | | Use NULL instead of nullptr to avoid adding a cxx11 requirement. | 2018-08-13 | |
| | | |||
* | | Don't use the auto keyword since it's not always supported properly. | 2018-08-13 | |
| | | |||
* | | Fixed syntax of nested templates chevrons to make it compatible with c++97 mode. | 2018-08-13 | |
| | | |||
* | | Avoided language features that are only available in cxx11 mode. | 2018-08-10 | |
| | | |||
* | | Made the code compile with gcc 5.4. | 2018-08-10 | |
| | | |||
* | | Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462) | 2018-08-08 | |
|\ \ | Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation. | ||
| * | | Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation. | 2018-08-08 | |
| | | | |||
* | | | Fix init order. | 2018-08-07 | |
|/ / | |||
* | | Silenced a couple of compilation warnings. | 2018-08-06 | |
| | | |||
* | | Fixed compilation errors. | 2018-08-06 | |
| | | |||
* | | Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation. | 2018-08-06 | |
| | | |||
| * | Move sigmoid functor to core. | 2018-08-03 | |
|/ | |||
* | bug #1451: fix numeric_limits<AutoDiffScalar<Der>> with a reference as derivative type | 2018-08-04 | |
| | |||
* | Fix initialization order. | 2018-08-03 | |
| | |||
* | Fixing the compilation error. | 2018-08-03 | |
| | |||
* | Creating separate SYCL required PR for uncontroversial files. | 2018-08-03 | |
| | |||
* | Merged in paultucker/eigen (pull request PR-431) | 2018-08-01 | |
|\ | Optional ThreadPoolDevice allocator. Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com> | ||
* \ | Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447) | 2018-08-01 | |
|\ \ | Adding variadic version of assert which can take a parameter pack as its input. | ||
* \ \ | Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446) | 2018-08-01 | |
|\ \ \ | Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation. | ||
| * | | | Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h | 2018-08-01 | |
| | | | | |||
* | | | | Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449) | 2018-08-01 | |
|\ \ \ \ | Enabling per device specialisation of packetSize. | ||
| | | * | | Using the suggested modification. | 2018-08-01 | |
| | | | | | |||
| * | | | | Enabling per device specialisation of packetsize. | 2018-08-01 | |
| | | | | | |||
| | | * | | variadic version of assert which can take a parameter pack as its input. | 2018-08-01 | |
| | |/ / | |/| | | |||
| | * | | Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation. | 2018-08-01 | |
| |/ / | |||
* / / | Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. | 2018-08-01 | |
|/ / | |||
* | | Merged in yuefengz/eigen (pull request PR-370) | 2018-07-31 | |
|\ \ | Use device's allocate function instead of internal::aligned_malloc. | ||
| | * | Change getAllocator() to allocator() in ThreadPoolDevice. | 2018-07-31 | |
| | | | |||
* | | | Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) | 2018-07-31 | |
|\ \ \ | Tiled tensor executor | ||
* | | | | Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437. | 2018-07-31 | |
| | | | | |||
| * | | | Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible | 2018-07-27 | |
| | | | | |||
| * | | | Add tiled evaluation support to TensorExecutor | 2018-07-25 | |
| | | | | |||
* | | | | Reduce the number of template specializations of classes related to tensor contraction to reduce binary size. | 2018-07-27 | |
| | | | | |||
* | | | | Removed several shadowing types and use global Index typedef everywhere | 2018-07-25 | |
| | | | | |||
* | | | | Rename variable which shadows class name | 2018-07-25 | |
| | | | | |||
| * | | | TensorBlockIO | 2018-07-23 | |
|/ / / | |||
* | | | Initial support of TensorBlock | 2018-07-20 | |
| | | | |||
| | * | Add test coverage for ThreadPoolDevice optional allocator. | 2018-07-19 | |
| | | | |||
* | | | PR430: Convert count to the reducer type in MeanReducer | 2018-07-19 | |
| | | | Without explicit conversion, TensorFlow fails to compile because pset1 template deduction fails: cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' in the statement: return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I'm not sure why it works in Eigen tests, because the Eigen::half constructor is explicit, or why it stopped working in TF; I didn't find any relevant changes since the previous Eigen upgrade. Using static_cast<T>(packetCount_) breaks the cxx11_tensor_reductions test for Eigen::half, which is also quite surprising. | ||
| | * | Actually add optional Allocator* arg to ThreadPoolDevice(). | 2018-07-16 | |
| | | | |||
| | * | Add optional Allocator argument to ThreadPoolDevice constructor. | 2018-07-16 | |
| | | | When supplied, this allocator will be used in place of internal::aligned_malloc. This permits, e.g., use of a NUMA-node-specific allocator where the thread pool is also restricted to a single NUMA node. | ||
* | | | bug #1558: fix a corner case in MINRES when both v_new and w_new vanish. | 2018-07-08 | |
| | | | |||
* | | | Reduce number of allocations in TensorContractionThreadPool. | 2018-07-16 | |
| | | | |||
* | | | bug #1569: fix Tensor<half>::mean() on AVX with respective unit test. | 2018-07-19 | |
| | | | |||
* | | | Assert that no output kernel is defined for GPU contraction | 2018-07-18 | |
| | | | |||
* | | | Specify default output kernel for TensorContractionOp | 2018-07-18 | |
| | | | |||
| * | | Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances. | 2018-02-20 | |
| | | | |||