Commit message [Author, Date]
...
* | | | Merged in codeplaysoftware/eigen-upstream-pure/new-arch-SYCL-headers (pull request PR-448) [Benoit Steiner, 2018-08-01]
|\ \ \ \
| | | | |   Adding new arch/SYCL headers, used for SYCL vectorization.
* \ \ \ \ Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449) [Benoit Steiner, 2018-08-01]
|\ \ \ \ \
| | | | | |   Enabling per device specialisation of packetSize.
* \ \ \ \ \ Merged in codeplaysoftware/eigen-upstream-pure/EIGEN_STRONG_INLINE_MACRO (pull request PR-445) [Benoit Steiner, 2018-08-01]
|\ \ \ \ \ \
| | | | | | |   Replacing ad-hoc inline keyword with EIGEN_STRONG_INLINE MACRO.
| | | | | * | Using the suggested modification. [Mehdi Goli, 2018-08-01]
| | | | | | |
| | * | | | | Enabling per device specialisation of packetsize. [Mehdi Goli, 2018-08-01]
| |/ / / / /
|/| | | | |
| | * | | | Adding new arch/SYCL headers, used for SYCL vectorization. [Mehdi Goli, 2018-08-01]
| |/ / / /
|/| | | |
| | | * | variadic version of assert which can take a parameter pack as its input. [Mehdi Goli, 2018-08-01]
| |_|/ /
|/| | |
| | * | Distinguishing internal memory allocation/deallocation from explicit user memory allocation/deallocation. [Mehdi Goli, 2018-08-01]
| |/ /
|/| |
| * | Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. [Mehdi Goli, 2018-08-01]
|/ /
* | Merged in yuefengz/eigen (pull request PR-370) [Benoit Steiner, 2018-07-31]
|\ \
| | |   Use device's allocate function instead of internal::aligned_malloc.
| | * Change getAllocator() to allocator() in ThreadPoolDevice. [Paul Tucker, 2018-07-31]
| | |
* | | Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) [Gael Guennebaud, 2018-07-31]
|\ \ \
| | | |   Tiled tensor executor
* | | | Speed up trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437. [Gael Guennebaud, 2018-07-31]
| | | |
* | | | bug #1577: fix msvc compilation of unit test, msvc defines ptrdiff_t as long long [Gael Guennebaud, 2018-07-30]
| | | |
| * | | Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible [Eugene Zhulenev, 2018-07-27]
| | | |
| * | | Add tiled evaluation support to TensorExecutor [Eugene Zhulenev, 2018-07-25]
| | | |
* | | | bug #1578: Improve prefetching in matrix multiplication on MIPS. [Alexey Frunze, 2018-07-24]
| | | |
* | | | Fix two small typos in the documentation [Patrik Huber, 2018-07-26]
| | | |
* | | | Merged in rmlarsen/eigen1 (pull request PR-441) [Gael Guennebaud, 2018-07-30]
|\ \ \ \
| | | | |   Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.
* | | | | Re-enable FMA for fast sqrt functions [Mark D Ryan, 2018-07-30]
| | | | |
* | | | | Re-enable FMA for fast sqrt functions [Mark D Ryan, 2018-07-30]
| | | | |   This commit re-enables the use of FMA for the FAST sqrt functions. Doing so improves the performance of both algorithms. The float32 version now runs at 88% of the speed of the original function, while the double version runs at 90%.
| * | | | Reduce the number of template specializations of classes related to tensor contraction to reduce binary size. [Rasmus Munk Larsen, 2018-07-27]
| | | | |
| | * | | TensorBlockIO [Eugene Zhulenev, 2018-07-23]
| | | | |
| | | | * Add test coverage for ThreadPoolDevice optional allocator. [Paul Tucker, 2018-07-19]
| | | | |
| | | | * Actually add optional Allocator* arg to ThreadPoolDevice(). [Paul Tucker, 2018-07-16]
| | | | |
| | | | * Add optional Allocator argument to ThreadPoolDevice constructor. [Paul Tucker, 2018-07-16]
| | | | |   When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted to a single NUMA-node.
* | | | | Fix AVX512 implementations of psqrt [Mark D Ryan, 2018-06-25]
|/ / / /
| | | |   This commit fixes the AVX512 implementations of psqrt in the same way that 3ed67cb0bb4af65fbf243df598604a8c7630bf7d fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in 3ed67cb0bb4af65fbf243df598604a8c7630bf7d shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.
* | | | Add pcast packet op for NEON. [Rasmus Munk Larsen, 2018-07-26]
| | | |
* | | | Disable static assertions only when necessary and disable double-promotion warnings in that case as well [Christoph Hertzberg, 2018-07-26]
| | | |
* | | | fix warnings for doc-eigen-prerequisites [Christoph Hertzberg, 2018-07-24]
| | | |
* | | | Removed several shadowing types and use global Index typedef everywhere [Christoph Hertzberg, 2018-07-25]
| | | |
* | | | Rename variable which shadows class name [Christoph Hertzberg, 2018-07-25]
| | | |
* | | | Account for missing change on commit "Remove SimpleThreadPool and..." [Gustavo Lima Chaves, 2018-07-23]
| | | |   "... always use {NonBlocking}ThreadPool". It seems the non-blocking implementation was made the default/only one, but a reference to the old name was left unmodified. Fix that.
* | | | Fixed issue which prevented the documentation from being built [Christoph Hertzberg, 2018-07-24]
| | | |
* | | | Allow to filter out build-error messages [Christoph Hertzberg, 2018-07-24]
|/ / /
* | | Initial support of TensorBlock [Eugene Zhulenev, 2018-07-20]
| | |
* | | Merged in glchaves/eigen (pull request PR-433) [Gael Guennebaud, 2018-07-23]
|\ \ \
| | | |   Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block
* | | | fix typo [Gael Guennebaud, 2018-07-23]
| | | |
* | | | Add lastN shortcuts to seq/seqN. [Gael Guennebaud, 2018-07-23]
| | | |
| * | | Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block [Gustavo Lima Chaves, 2018-07-20]
| | | |   Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail right away without this, as this test seems to rely on those language features. The skip under compilation with MSVC was kept.
* | | | Disable type traits for stdlibc++ <= 4.9.3 [Eugene Zhulenev, 2018-07-20]
|/ / /
* | | Oops, EIGEN_COMP_MSVC is not available before including Eigen. [Gael Guennebaud, 2018-07-20]
| | |
* | | Disable optimization for sparse_product unit test with MSVC 2013, otherwise it takes several hours to build. [Gael Guennebaud, 2018-07-20]
| | |
* | | PR430: Convert count to the reducer type in MeanReducer [Eugene Zhulenev, 2018-07-19]
| | |   Without explicit conversion TensorFlow fails to compile because pset1 template deduction fails:
| | |     cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this)->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}'
| | |     return pdiv(vaccum, pset1<Packet>(packetCount_));
| | |   Honestly I’m not sure why it works in Eigen tests, because the Eigen::half constructor is explicit, or why it stopped working in TF; I didn’t find any relevant changes since the previous Eigen upgrade.
| | |   static_cast<T>(packetCount_) breaks the cxx11_tensor_reductions test for Eigen::half, which is also quite surprising.
* | | Pass by const ref. [Gael Guennebaud, 2018-07-19]
| | |
* | | Fix IsRelocatable without C++11 [Gael Guennebaud, 2018-07-19]
| | |
* | | Fix determination of EIGEN_HAS_TYPE_TRAITS [Gael Guennebaud, 2018-07-19]
| | |
* | | Fix stupid error in Quaternion move ctor [Gael Guennebaud, 2018-07-19]
| | |
* | | bug #1558: fix a corner case in MINRES when both v_new and w_new vanish. [David Hyde, 2018-07-08]
| | |
* | | Reduce number of allocations in TensorContractionThreadPool. [Eugene Zhulenev, 2018-07-16]
| | |