Commit message
- Move the colamd implementation into its own namespace to avoid polluting the internal namespace with Ok, Status, etc.
- Fix a signed/unsigned warning
- Turn some ugly free functions into member functions
const definitions
COLAMD_DEAD to prevent conflicts with other libraries / code.
casting, which broke the build with -march=native on Haswell/Skylake.
arguments to log1p such that log1p(inf) = inf.
than -1. Fix packet op accordingly.
|
| |
| |
| |
| | |
half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories.
Asynchronous parallelFor in Eigen ThreadPoolDevice
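The idea behind an asynchronous parallel-for can be sketched in plain C++ (hypothetical names and signatures, not Eigen's actual ThreadPoolDevice API): split the range into blocks, hand each block to a worker thread, and invoke a completion callback when the last block finishes, so the caller does not block.

```cpp
#include <algorithm>
#include <atomic>
#include <functional>
#include <memory>
#include <thread>

// Illustrative sketch only: partition [0, n) into blocks, run f on each block
// in a worker thread, and call done() once the last block completes. The
// caller returns immediately instead of waiting for all work to finish.
void parallel_for_async(long n, long block_size,
                        std::function<void(long, long)> f,
                        std::function<void()> done) {
  long num_blocks = (n + block_size - 1) / block_size;
  auto pending = std::make_shared<std::atomic<long>>(num_blocks);
  for (long b = 0; b < num_blocks; ++b) {
    long begin = b * block_size;
    long end = std::min(n, begin + block_size);
    std::thread([=] {
      f(begin, end);
      if (pending->fetch_sub(1) == 1) done();  // last block signals completion
    }).detach();
  }
}
```

The shared atomic counter is what replaces the barrier a blocking parallelFor would wait on: whichever worker decrements it to zero fires the callback.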
|
| | |
| | |
| | |
| | |
| | | |
Newlib in the Native Client SDK does not provide the ::random function.
Implement get_random_seed for NaCl using ::rand, similarly to the Windows version.
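Such a fallback can be sketched as follows (a hedged illustration, not Eigen's exact code): where ::random is unavailable, mix the current time with a couple of ::rand outputs to derive a 64-bit seed.

```cpp
#include <cstdint>
#include <cstdlib>
#include <ctime>

// Sketch of a fallback seed source for platforms whose C library lacks
// ::random (e.g. Newlib in the NaCl SDK): XOR the wall-clock time with two
// ::rand outputs, similar in spirit to the Windows fallback mentioned above.
inline uint64_t get_random_seed_fallback() {
  uint64_t hi = static_cast<uint64_t>(std::rand());
  uint64_t lo = static_cast<uint64_t>(std::rand());
  return static_cast<uint64_t>(std::time(nullptr)) ^ ((hi << 32) | lo);
}
```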
|
| |/ |
|
| |\
| | |
| | |
| | | |
Fixes for Altivec/VSX and compilation with clang on PowerPC
Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.
This actually fixes an issue in the unit test packetmath_2 with pcmp_eq when it is compiled with clang. When pcmp_eq(Packet4f,Packet4f) is used instead of pcmp_eq(Packet2d,Packet2d), the unit test fails due to NaNs in the reference vector.
Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.
Depending on instruction set, significant speedups are observed for the vectorized path:
log1p wall time is reduced 60-93% (2.5x - 15x speedup)
expm1 wall time is reduced 0-85% (1x - 7x speedup)
The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly.
Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM
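Kahan's trick can be sketched in scalar C++ (hypothetical helper names; the Eigen packet versions vectorize the same algebra): compute log(1+x) as log(u)·x/(u−1) with u = 1+x, which cancels the rounding error made when forming u, plus an explicit branch so that infinite arguments propagate instead of producing NaN.

```cpp
#include <cmath>
#include <limits>

// Kahan's formula for log(1+x): let u = 1+x (rounded). For tiny x, u == 1
// and log1p(x) ~= x; otherwise log(u) * x / (u - 1) corrects the rounding
// error introduced when forming u. The isinf check makes log1p(inf) = inf.
double log1p_kahan(double x) {
  double u = 1.0 + x;
  if (u == 1.0) return x;         // |x| below the rounding threshold
  if (std::isinf(u)) return u;    // propagate +infinity
  return std::log(u) * x / (u - 1.0);
}

// Kahan's formula for exp(x)-1: let u = exp(x). For tiny x, u == 1 and
// expm1(x) ~= x; otherwise (u - 1) * x / log(u) corrects the rounding error.
double expm1_kahan(double x) {
  double u = std::exp(x);
  if (u == 1.0) return x;
  if (std::isinf(u)) return u;    // expm1(+inf) = +inf
  double um1 = u - 1.0;
  return (um1 == -1.0) ? -1.0     // x -> -inf: exp(x)-1 -> -1
                       : um1 * x / std::log(u);
}
```

The extra isinf branch is exactly the "properly handle infinite arguments" change the commit describes, and it is also why the scalar path pays the 20-30% penalty quoted above.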
The vec_vsx_ld/vec_vsx_st builtins were wrongly used for aligned load/store. In fact, they perform unaligned memory access and, even when the address is 16-byte aligned, they are much slower (at least 2x) than their aligned counterparts.
For double/Packet2d, vec_xl/vec_xst should be preferred over vec_ld/vec_st, although the latter work when cast to float/Packet4f.
Also silences some weird warnings thrown by some GCC versions; such warnings are not thrown by Clang.
If no offset is given, then it should be zero.
Also passes the full address to the vec_vsx_ld/st builtins.
Removes the useless _EIGEN_ALIGNED_PTR & _EIGEN_MASK_ALIGNMENT.
Removes unnecessary casts.
Ignoring -Wc11-extensions warnings thrown by clang at Altivec/PacketMath.h
each other.
Add specializations for complex types, since std::log1p and std::expm1 do not support complex arguments.
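Since the standard library's log1p/expm1 are real-only, a complex specialization can fall back to the non-fused forms. The helpers below are an illustrative sketch (hypothetical names, not Eigen's exact specializations); they give up the extra accuracy near zero that the fused real-valued forms provide.

```cpp
#include <complex>

// std::log1p has no std::complex overload, so compute log(1+z) directly.
template <typename T>
std::complex<T> log1p_complex(const std::complex<T>& z) {
  return std::log(std::complex<T>(T(1) + z.real(), z.imag()));
}

// Likewise, std::expm1 has no complex overload: compute exp(z) - 1 directly.
template <typename T>
std::complex<T> expm1_complex(const std::complex<T>& z) {
  return std::exp(z) - std::complex<T>(T(1), T(0));
}
```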
does not support this.
blocks if the strides are known to be 1. Provides up to 20-25% speedup of the TF cross entropy op with AVX.
A few benchmark numbers:
name                  old time/op  new time/op  delta
BM_Xent_16_10000_cpu  448µs ± 3%   389µs ± 2%   -13.21% (p=0.008 n=5+5)
BM_Xent_32_10000_cpu  575µs ± 6%   454µs ± 3%   -21.00% (p=0.008 n=5+5)
BM_Xent_64_10000_cpu  933µs ± 4%   712µs ± 1%   -23.71% (p=0.008 n=5+5)
https://bitbucket.org/eigen/eigen/pull-requests/662.
The change caused the device struct to be copied for each expression evaluation, and caused, e.g., a 10% regression in the TensorFlow multinomial op on GPU:
Benchmark Time(ns) CPU(ns) Iterations
----------------------------------------------------------------------
BM_Multinomial_gpu_1_100000_4 128173 231326 2922 1.610G items/s
VS
Benchmark Time(ns) CPU(ns) Iterations
----------------------------------------------------------------------
BM_Multinomial_gpu_1_100000_4 146683 246914 2719 1.509G items/s
intended to be part of the code.