eigen - C++ library for linear algebra

	Commit message	Author	Age
*	bug #1085: workaround gcc default ABI issue	Gael Guennebaud	2015-10-10
\|
*	_mm_hadd_epi32 is for SSSE3 only (and not SSE3)	Gael Guennebaud	2015-10-07
\|
*	Handle various TODOs in SSE vectorization (remove split storeu, enable ↵	Gael Guennebaud	2015-10-06
\| \| \| \|	SSE3 integer vectorization, plus minor tweaks)
*	bug #1069: fix AVX support on MSVC (use of non portable C-style cast)	Gael Guennebaud	2015-09-28
\|
*	Added support for predux_mul for CUDA devices	Benoit Steiner	2015-09-08
\|
*	Implement plog and pexp for AltiVec.	Doug Kwan	2015-07-30
\|
*	Fix prototype of plset and generalize linspace functor.	Gael Guennebaud	2015-08-07
\|
*	Include SSE packetmath when AVX is enabled, and enable AVX's sine function ↵	Gael Guennebaud	2015-08-07
\| \| \| \|	only in fast-math mode (as SSE)
*	Let unpacket_traits<> expose the required alignment and make use of it ↵	Gael Guennebaud	2015-08-07
\| \| \| \|	everywhere
*	Fix shadow warnings triggered by clang	Gael Guennebaud	2015-06-09
\|
*	Abandon blocking size lookup table approach. Not performing as well in real ↵	Benoit Jacob	2015-05-19
\| \| \| \|	world as in microbenchmark.
*	also uninitialized here, see previous cset	Benoit Jacob	2015-05-15
\|
*	Fix uninitialized var warning. The compiler was clearing the register ↵	Benoit Jacob	2015-05-15
\| \| \| \|	anyway, so this does not change resulting code
*	Merged in doug_kwan/eigen (pull request PR-103)	Konstantinos Margaritis	2015-05-05
\|\ \| \| \| \| \| \|	Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of doubles instead of swapping the doubles.
* \|	Added a double-precision implementation of the exp() function for AVX.	Benoit Steiner	2015-05-04
\| \|
* \|	Pulled latest update from the eigen main codebase	Benoit Steiner	2015-03-24
\|\ \
\| * \|	Fixed the CUDA packet primitives	Benoit Steiner	2015-03-24
\| \| \|
\| * \|	use unsigned short instead of uint16_t which doesn't exist in c++98	Benoit Jacob	2015-03-17
\| \| \|
\| * \|	Update Nexus 5 lookup table by combining 2 runs of the benchmark, ↵	Benoit Jacob	2015-03-16
\| \| \| \| \| \| \| \| \| \| \| \|	using the analyze-blocking-sizes partition tool. Gives better worst-case performance.
\| * \|	Provide an empirical lookup table for blocking sizes measured on a Nexus 5. ↵	Benoit Jacob	2015-03-15
\| \| \| \| \| \| \| \| \| \| \| \|	Only for float, only for Android on ARM 32bit for now.
\| \| *	Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of	Doug Kwan	2015-03-11
\| \|/ \| \| \| \| \| \|	doubles instead of swapping the doubles.
* \|	Fixed the optimized AVX implementation of the fast rsqrt function	Benoit Steiner	2015-03-02
\| \|
* \|	Added an optimized version of rsqrt for SSE and AVX that is used when ↵	Benoit Steiner	2015-03-02
\| \| \| \| \| \| \| \|	EIGEN_FAST_MATH is defined.
* \|	Pulled latest updates from trunk	Benoit Steiner	2015-02-27
\|\ \
* \| \|	Switch to truncated casting when converting floating point types to integer. ↵	Benoit Steiner	2015-02-27
\| \| \| \| \| \| \| \| \| \| \| \|	This ensures that vectorized casts are consistent with scalar casts
* \| \|	Added support for vectorized type casting of tensors	Benoit Steiner	2015-02-27
\| \| \|
* \| \|	Added support for fast reciprocal square root computation.	Benoit Steiner	2015-02-26
\| \| \|
\| \| *	must also disable complex<double> when disabling double vectorization	Benoit Jacob	2015-03-03
\| \| \|
\| \| *	Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON ↵	Benoit Jacob	2015-03-03
\| \| \| \| \| \| \| \| \| \| \| \|	intrinsics.
\| \| *	HalfPacket also needed to be disabled for double, on ARMv8.	Benoit Jacob	2015-03-02
\| \|/
\| *	remove trailing comma	Benoit Jacob	2015-02-27
\| \|
\| *	Disable Packet2f/2i halfpacket support in NEON.	Benoit Jacob	2015-02-27
\|/ \| \| \| \| \|	I believe that it was erroneously turned on, since Packet2f/2i intrinsics are unimplemented, and code trying to use halfpackets just fails to compile on NEON, as it tries to use the default implementation of pload/pstore and the types don't match.
*	Marked the CUDA packet primitives as EIGEN_DEVICE_FUNC since they'll end up ↵	Benoit Steiner	2015-02-19
\| \| \| \|	being executed on the GPU device.
*	bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path	Benoit Jacob	2015-02-18
\| \| \| \| \| \| \| \|	This is substantially faster on ARM, where it's important to minimize the number of loads. This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome. Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on Sandy Bridge, in my experience, it's not beneficial (even about 1% slower).
*	Add missing install directives for arch/CUDA	Gael Guennebaud	2015-02-18
\|
*	Remove some dead stores.	Gael Guennebaud	2015-02-18
\|
*	Disable __m128* wrappers when compiling with AVX and -fabi-version=4	Gael Guennebaud	2015-02-17
\|
*	Fix compilation with GCC/AVX (workaround __m128 and __m256 being the same ↵	Gael Guennebaud	2015-02-17
\| \| \| \|	type with default ABI)
*	Merged in chtz/eigen-indexconversion (pull request PR-92)	Gael Guennebaud	2015-02-16
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug #877, bug #572: Get rid of Index conversion warnings, summary of changes: - Introduce a global typedef Eigen::Index making Eigen::DenseIndex and AnyExpr<>::Index deprecated (default is std::ptrdiff_t). - Eigen::Index is used throughout the API to represent indices, offsets, and sizes. - Classes storing an array of indices use the type StorageIndex to store them. This is a template parameter of the class. Default is int. - Methods that explicitly set or return an element of such an array take or return a StorageIndex type. In all other cases, the Index type is used.
\| *	The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index	Gael Guennebaud	2015-02-16
\| \|
* \|	Pulled latest updates from trunk	Benoit Steiner	2015-02-13
\|\\|
* \|	Optimized version of the sin(), exp(), log() and sqrt() function for AVX	Benoit Steiner	2015-02-13
\| \|
\| *	merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper	Gael Guennebaud	2015-02-12
\|/\|
* \|	merge	Gael Guennebaud	2015-02-10
\|\ \
* \| \|	FMA has been wrongly disabled	Gael Guennebaud	2015-02-10
\| \| \|
\| * \|	Added vectorized implementation of the exponential function for ARM/NEON	Benoit Steiner	2015-02-10
\|/ /
\| *	Pulled the latest changes from the trunk	Benoit Steiner	2015-02-06
\| \|\ \| \|/ \|/\|
* \|	bug #936, patch 3/3: Properly detect FMA support on ARM (requires VFPv4)	Benoit Jacob	2015-01-30
\| \| \| \| \| \| \| \| \| \|	and use it instead of MLA when available, because it's both more accurate, and faster.
* \|	bug #936, patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with ↵	Benoit Jacob	2015-01-30
\| \| \| \| \| \| \| \|	EIGEN_HAS_SINGLE_INSTRUCTION_MADD
* \|	bug #936, patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_,	Benoit Jacob	2015-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	because this is what they are about. "Fused" means "no intermediate rounding between the mul and the add, only one rounding at the end". Instead, what we are concerned about here is whether a temporary register is needed, i.e. whether the MUL and ADD are separate instructions. Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA. But a true fused mul-add is only available on VFPv4: VFMA.
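
The distinction drawn in the message above (a fused mul-add rounds once, a separate mul and add round twice) can be illustrated with a minimal C++ sketch. It is not Eigen code: the names mul_then_add, fused_mul_add and demo_inputs are hypothetical, and std::fma from <cmath> stands in for a hardware instruction such as VFMA. The inputs are chosen so that the exact product, 1 - 2^-54, is not representable as a double, assuming IEEE 754 doubles with the default round-to-nearest mode.

```cpp
#include <cassert>  // only needed for the assert shown in the usage note
#include <cmath>

// Two-step multiply-add: the product is rounded to double before the add.
// The volatile store keeps the compiler from contracting this into an FMA.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;  // first rounding happens here
    return product + c;               // second rounding happens here
}

// Fused multiply-add: a*b+c is evaluated exactly, then rounded once.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Hypothetical demo inputs: (1 + 2^-27) * (1 - 2^-27) = 1 - 2^-54 exactly,
// which lies halfway between two doubles and rounds to 1.0.
struct Demo { double a, b, c; };

Demo demo_inputs() {
    const double eps = std::ldexp(1.0, -27);  // 2^-27
    return {1.0 + eps, 1.0 - eps, -1.0};
}
```

With these inputs the two-step version loses the low-order term when the product is rounded to 1.0, while the fused version keeps it, so a check such as assert(mul_then_add(d.a, d.b, d.c) != fused_mul_add(d.a, d.b, d.c)) with d = demo_inputs() should hold. Whether a single-instruction but unfused operation like VMLA behaves like the two-step version is taken from the commit message, not verified here.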