eigen - C++ library for linear algebra

	Commit message	Author	Age
*	bug #1195: move NumTraits::Div<>::Cost to internal::scalar_div_cost (with ↵	Gael Guennebaud	2016-09-08
\| \| \| \|	some specializations in arch/SSE and arch/AVX)
*	Implement pmadd for float and double to make it consistent with the ↵	Gael Guennebaud	2016-08-23
\| \| \| \|	vectorized path when FMA is available.
*	Remove now-unused protate PacketMath func	Benoit Jacob	2016-05-24
\|
*	Optimized implementation of the tanh function for SSE	Benoit Steiner	2016-02-10
\|
*	Remove custom unaligned loads for SSE. They were only useful for core2 CPU.	Gael Guennebaud	2016-02-08
\|
*	Fix "," in non SSE4 mode	Gael Guennebaud	2015-11-05
\|
*	Add round, ceil and floor for SSE4.1/AVX (Bug #70)	Alexandre Avenel	2015-11-01
\|
*	bug #1085: workaround gcc default ABI issue	Gael Guennebaud	2015-10-10
\|
*	_mm_hadd_epi32 is for SSSE3 only (and not SSE3)	Gael Guennebaud	2015-10-07
\|
*	Handle various TODOs in SSE vectorization (remove split storeu, enable ↵	Gael Guennebaud	2015-10-06
\| \| \| \|	SSE3 integer vectorization, plus minor tweaks)
*	Fix prototype of plset and generalize linspace functor.	Gael Guennebaud	2015-08-07
\|
*	Let unpacket_traits<> expose the required alignment and make use of it ↵	Gael Guennebaud	2015-08-07
\| \| \| \|	everywhere
*	Added support for fast reciprocal square root computation.	Benoit Steiner	2015-02-26
\|
*	bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path	Benoit Jacob	2015-02-18
\| \| \| \| \| \| \| \|	This is substantially faster on ARM, where it's important to minimize the number of loads. This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome. Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower).
*	Disable __m128* wrappers when compiling with AVX and -fabi-version=4	Gael Guennebaud	2015-02-17
\|
*	Fix compilation with GCC/AVX (workaround __m128 and __m256 being the same ↵	Gael Guennebaud	2015-02-17
\| \| \| \|	type with default ABI)
*	The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index	Gael Guennebaud	2015-02-16
\|
*	merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper	Gael Guennebaud	2015-02-12
\|\
* \|	FMA has been wrongly disabled	Gael Guennebaud	2015-02-10
\| \|
\| *	Pulled the latest changes from the trunk	Benoit Steiner	2015-02-06
\| \|\ \| \|/ \|/\|
* \|	bug #936, patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with ↵	Benoit Jacob	2015-01-30
\| \| \| \| \| \| \| \|	EIGEN_HAS_SINGLE_INSTRUCTION_MADD
* \|	bug #936, patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_,	Benoit Jacob	2015-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	because this is what they are about. "Fused" means "no intermediate rounding between the mul and the add, only one rounding at the end". Instead, what we are concerned about here is whether a temporary register is needed, i.e. whether the MUL and ADD are separate instructions. Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA. But a true fused mul-add is only available on VFPv4: VFMA.
* \|	Introduce unified macros to identify compiler, OS, and architecture. They ↵	Gael Guennebaud	2014-11-04
\| \| \| \| \| \| \| \|	are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively.
\| *	Pulled in the latest changes from the Eigen trunk	Benoit Steiner	2014-08-13
\| \|\ \| \|/ \|/\|
* \|	Fix many long to int implicit conversions	Gael Guennebaud	2014-07-08
\| \|
\| *	Created the pblend packet primitive and implemented it using SSE and AVX ↵	Benoit Steiner	2014-06-06
\|/ \| \| \|	instructions.
*	Make sure that calls to broadcast4 are 16 bytes aligned	Gael Guennebaud	2014-04-25
\|
*	Enable vectorization of pack_rhs with a column-major RHS.	Gael Guennebaud	2014-04-25
\| \| \| \|	Rename and generalize Kernel<> to PacketBlock<,N>.
*	Enable fused madd for Altivec	Gael Guennebaud	2014-04-24
\|
*	Workaround gcc's default ABI not being able to distinguish between vector ↵	Gael Guennebaud	2014-04-22
\| \| \| \|	types of different sizes.
*	New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge ↵	Gael Guennebaud	2014-04-16
\| \| \| \| \| \|	speedup on Haswell. This changeset also introduces new vector functions: ploadquad and predux4.
*	Optimized SSE unaligned loads and stores when compiling a 64bit target with ↵	Benoit Steiner	2014-04-14
\| \| \| \|	a recent version of gcc (i.e., gcc 4.8).
*	Add a mechanism to recursively access half-size packet types	Gael Guennebaud	2014-03-28
\|
*	Implemented the SSE version of the gather and scatter packet primitives.	Benoit Steiner	2014-03-27
\|
*	Created the ptranspose packet primitive that can transpose an array of N ↵	Benoit Steiner	2014-03-26
\| \| \| \| \| \|	packets, where N is the number of words in each packet. This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions. Implemented the primitive using SSE instructions.
*	Merged latest updates from the parent branch	Benoit Steiner	2014-03-26
\|\
\| *	Implement new 1 packet x 8 gebp kernel	Gael Guennebaud	2014-03-26
\| \|
\| *	add pbroadcast2/4 generic intrinsics	Gael Guennebaud	2014-03-26
\| \|
* \|	Added support for FMA instructions	Benoit Steiner	2014-02-24
\| \|
* \|	Added support for AVX to Eigen.	Benoit Steiner	2014-01-29
\| \|
\| *	Revert previous change and introduce a new workaround regarding gcc ↵	Gael Guennebaud	2014-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	generating a shufps instruction instead of the more efficient pshufd instruction. The trick consists in introducing a new pload1 function to be used in low level product kernels for which bug #203 does not apply. Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering.
\| *	Make gcc generate a pshufd instruction for pset1	Gael Guennebaud	2014-03-20
\|/
*	Remove useless register keyword, and optimize predux_min/max for SSE4	Gael Guennebaud	2014-01-25
\|
*	Fix bug #642: add vectorization of sqrt for doubles, and make sqrt really ↵	Gael Guennebaud	2013-08-19
\| \| \| \|	safe if EIGEN_FAST_MATH is disabled
*	Add missing pconj specializations	Gael Guennebaud	2013-05-17
\|
*	Add SSE4 min/max for integers	Gael Guennebaud	2013-03-20
\|
*	add SSE pexp function for double, make use of _mm_floor_p* for pexp with SSE4.1	Gael Guennebaud	2012-07-27
\|
*	Automatic relicensing to MPL2 using Keirs script. Manual fixup follows.	Benoit Jacob	2012-07-13
\|
*	Get rid of include directives inside namespace blocks (bug #339).	Jitse Niesen	2012-04-15
\|
*	proper C++ casting	Gael Guennebaud	2012-01-31
\|