eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h	Mehdi Goli	2018-08-01
\|
*	Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	Mehdi Goli	2018-08-01
*	Merged in yuefengz/eigen (pull request PR-370)	Benoit Steiner	2018-07-31
\|\ \| \| \| \| \| \|	Use device's allocate function instead of internal::aligned_malloc.
* \	Merged in ezhulenev/eigen/tiling_3 (pull request PR-438)	Gael Guennebaud	2018-07-31
\|\ \ \| \| \| \| \| \| \| \| \|	Tiled tensor executor
* \| \|	Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.	Gael Guennebaud	2018-07-31
* \| \|	bug #1577: fix msvc compilation of unit test, msvc defines ptrdiff_t as long long	Gael Guennebaud	2018-07-30
\| * \|	Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible	Eugene Zhulenev	2018-07-27
\| \| \|
\| * \|	Add tiled evaluation support to TensorExecutor	Eugene Zhulenev	2018-07-25
\| \| \|
* \| \|	bug #1578: Improve prefetching in matrix multiplication on MIPS.	Alexey Frunze	2018-07-24
\| \| \|
* \| \|	Fix two small typos in the documentation	Patrik Huber	2018-07-26
\| \| \|
* \| \|	Merged in rmlarsen/eigen1 (pull request PR-441)	Gael Guennebaud	2018-07-30
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \|	Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.
* \| \| \|	Re-enable FMA for fast sqrt functions	Mark D Ryan	2018-07-30
\| \| \| \|
* \| \| \|	Re-enable FMA for fast sqrt functions	Mark D Ryan	2018-07-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit re-enables the use of FMA for the FAST sqrt functions. Doing so improves the performance of both algorithms. The float32 version is now 88% the speed of the original function, while the double version is 90%.
\| * \| \|	Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.	Rasmus Munk Larsen	2018-07-27
\| \| * \|	TensorBlockIO	Eugene Zhulenev	2018-07-23
\| \| \| \|
* \| \| \|	Fix AVX512 implementations of psqrt	Mark D Ryan	2018-06-25
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit fixes the AVX512 implementations of psqrt in the same way that 3ed67cb0bb4af65fbf243df598604a8c7630bf7d fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in 3ed67cb0bb4af65fbf243df598604a8c7630bf7d shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.
* \| \|	Add pcast packet op for NEON.	Rasmus Munk Larsen	2018-07-26
\| \| \|
* \| \|	Disable static assertions only when necessary and disable double-promotion warnings in that case as well	Christoph Hertzberg	2018-07-26
* \| \|	fix warnings for doc-eigen-prerequisites	Christoph Hertzberg	2018-07-24
\| \| \|
* \| \|	Removed several shadowing types and use global Index typedef everywhere	Christoph Hertzberg	2018-07-25
\| \| \|
* \| \|	Rename variable which shadows class name	Christoph Hertzberg	2018-07-25
\| \| \|
* \| \|	Account for missing change on commit "Remove SimpleThreadPool and..."	Gustavo Lima Chaves	2018-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"... always use {NonBlocking}ThreadPool". It seems the non-blocking implementation was made the default/only one, but a reference to the old name was left unmodified. Fix that.
* \| \|	Fixed issue which made documentation not getting built anymore	Christoph Hertzberg	2018-07-24
\| \| \|
* \| \|	Allow to filter out build-error messages	Christoph Hertzberg	2018-07-24
\|/ /
* \|	Initial support of TensorBlock	Eugene Zhulenev	2018-07-20
\| \|
* \|	Merged in glchaves/eigen (pull request PR-433)	Gael Guennebaud	2018-07-23
\|\ \ \| \| \| \| \| \| \| \| \|	Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block
* \| \|	fix typo	Gael Guennebaud	2018-07-23
\| \| \|
* \| \|	Add lastN shortcuts to seq/seqN.	Gael Guennebaud	2018-07-23
\| \| \|
\| * \|	Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block	Gustavo Lima Chaves	2018-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail right away without this, as this test seems to rely on those language features. The skip under compilation with MSVC was kept.
* \| \|	Disable type traits for stdlibc++ <= 4.9.3	Eugene Zhulenev	2018-07-20
\|/ /
* \|	Oops, EIGEN_COMP_MSVC is not available before including Eigen.	Gael Guennebaud	2018-07-20
\| \|
* \|	Disable optimization for sparse_product unit test with MSVC 2013, otherwise it takes several hours to build.	Gael Guennebaud	2018-07-20
* \|	PR430: Convert count to the reducer type in MeanReducer	Eugene Zhulenev	2018-07-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails. cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade. static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.
* \|	Pass by const ref.	Gael Guennebaud	2018-07-19
\| \|
* \|	Fix IsRelocatable without C++11	Gael Guennebaud	2018-07-19
\| \|
* \|	Fix determination of EIGEN_HAS_TYPE_TRAITS	Gael Guennebaud	2018-07-19
\| \|
* \|	Fix stupid error in Quaternion move ctor	Gael Guennebaud	2018-07-19
\| \|
* \|	bug #1558: fix a corner case in MINRES when both v_new and w_new vanish.	David Hyde	2018-07-08
\| \|
* \|	Reduce number of allocations in TensorContractionThreadPool.	Eugene Zhulenev	2018-07-16
\| \|
* \|	bug #1569: fix Tensor<half>::mean() on AVX with respective unit test.	Gael Guennebaud	2018-07-19
\| \|
* \|	Add MIPS changes missing from previous merge.	Alexey Frunze	2018-07-18
\| \|
* \|	Assert that no output kernel is defined for GPU contraction	Eugene Zhulenev	2018-07-18
\| \|
* \|	Disable type traits for GCC < 5.1.0	Eugene Zhulenev	2018-07-18
\| \|
* \|	Specify default output kernel for TensorContractionOp	Eugene Zhulenev	2018-07-18
\| \|
* \|	Add regression for bugs #1573 and #1575	Gael Guennebaud	2018-07-18
\| \|
* \|	bug #1432: fix conservativeResize for non-relocatable scalar types.	Gael Guennebaud	2018-07-18
\| \| \| \| \| \| \| \|	For those we need to by-pass realloc routines and fall-back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable
* \|	Generalize ScalarWithExceptions to a full non-copyable and throwing scalar type to be used in other unit tests.	Gael Guennebaud	2018-07-18
* \|	bug #1575: fix regression introduced in bug #1573 patch. Move ctor/assignment should not be defaulted.	Gael Guennebaud	2018-07-18
* \|	More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA	Gael Guennebaud	2018-07-18
\| \|
\| *	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	Yuefeng Zhou	2018-02-20