eigen - C++ library for linear algebra

	Commit message	Author	Age
*	HasExp added for AVX512 Packet8d	Jakub Lichman	2021-04-20
*	Fix pfrexp/pldexp for half.	Antonio Sanchez	2021-01-21
	The recent addition of vectorized pow (!330) relies on `pfrexp` and `pldexp`. This was missing for `Eigen::half` and `Eigen::bfloat16`. Adding tests for these packet ops also exposed an issue with handling negative values in `pfrexp`, returning an incorrect exponent. Added the missing implementations, corrected the exponent in `pfrexp1`, and added `packetmath` tests.
*	Add log2() to Eigen.	Rasmus Munk Larsen	2020-12-04
*	Revert "Add log2() operator to Eigen"	Rasmus Munk Larsen	2020-12-03
	This reverts commit 4d91519a9be061da5d300079fca17dd0b9328050.
*	Add log2() operator to Eigen	Rasmus Munk Larsen	2020-12-03
*	Fix a few issues for AVX512. This change enables vectorized versions of log, exp, log1p, expm1 when AVX512DQ is not available.	Rasmus Munk Larsen	2020-12-01
*	AVX512 missing ops.	Antonio Sanchez	2020-11-30
	This allows the `packetmath` tests to pass for AVX512 on skylake. Made `half` and `bfloat16` consistent in terms of ops they support. Note the `log` tests are currently disabled for `bfloat16` since they fail due to poor precision (they were previously disabled for `Packet8bf` via test function specialization -- I just removed that specialization and disabled it in the generic test).
*	Improve polynomial evaluation with instruction-level parallelism for pexp_float and pexp<Packet16f>	guoqiangqi	2020-10-20
*	Add AVX plog<Packet4d> and AVX512 plog<Packet8d> ops, also unified AVX512 plog<Packet16f> op with generic api	Guoqiang QI	2020-10-15
*	AVX path for BF16	Sheng Yang	2020-07-14
*	Support BFloat16 in Eigen	Teng Lu	2020-06-20
*	Fix for gcc build error when using Eigen headers with AVX512	Anuj Rawat	2020-01-10
*	1. Fix a bug in psqrt and make it return 0 for +inf arguments.	Rasmus Munk Larsen	2019-11-15
	2. Simplify handling of special cases by taking advantage of the fact that the builtin vrsqrt approximation handles negative, zero and +inf arguments correctly. This speeds up the SSE and AVX implementations by ~20%.
	3. Make the Newton-Raphson formula used for rsqrt more numerically robust:
	     Before: y = y * (1.5 - x/2 * y^2)
	     After:  y = y * (1.5 - y * (x/2) * y)
	   Forming y^2 can overflow for very large or very small (denormalized) values of x, while x*y ~= 1. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.
	4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.
	Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o
*	bug #1744: fix compilation with MSVC 2017 and AVX512, plog1p/pexpm1 require plog/pexp, but the latter was disabled on some compilers	Gael Guennebaud	2019-11-15
*	PR 751: Fixed compilation issue when compiling using MSVC with /arch:AVX512 flag	Sakshi Goynar	2019-10-31
*	Move implementation of vectorized error function erf() to SpecialFunctionsImpl.h.	Rasmus Munk Larsen	2019-09-27
*	Add generic PacketMath implementation of the Error Function (erf).	Rasmus Munk Larsen	2019-09-19
*	Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.	Rasmus Munk Larsen	2019-08-12
	Depending on instruction set, significant speedups are observed for the vectorized path:
	  log1p wall time is reduced 60-93% (2.5x - 15x speedup)
	  expm1 wall time is reduced 0-85% (1x - 7x speedup)
	The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly.
	Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM
*	fix plog(+inf) with AVX512	Gael Guennebaud	2019-01-09
*	Add psin/pcos on AVX512 -> almost for free, at last!	Gael Guennebaud	2018-11-30
*	Fix float-to-double warning	Gael Guennebaud	2018-10-16
*	Fix avx512 plog(NaN) to return NaN instead of +inf	Gael Guennebaud	2018-10-11
*	Enable avx512 plog with clang	Gael Guennebaud	2018-10-11
*	Re-enable FMA for fast sqrt functions	Mark D Ryan	2018-07-30
*	Fix AVX512 implementations of psqrt	Mark D Ryan	2018-06-25
	This commit fixes the AVX512 implementations of psqrt in the same way that 3ed67cb0bb4af65fbf243df598604a8c7630bf7d fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in 3ed67cb0bb4af65fbf243df598604a8c7630bf7d shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.
*	fix AVX512 plog	Jayaram Bobba	2018-04-20
*	AVX512: _mm512_rsqrt28_ps is available for AVX512ER only	Gael Guennebaud	2018-04-03
*	AVX512: fix psqrt and prsqrt	Gael Guennebaud	2018-04-03
*	Disabled some of the AVX512 primitives on compilers that don't support them	Benoit Steiner	2016-04-29
*	Commented out the version of pexp<Packet8d> since it fails to compile with gcc 5.3	Benoit Steiner	2016-02-04
*	Added implementations of pexp, plog, psqrt, and prsqrt optimized for AVX512	Benoit Steiner	2016-02-04
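
The 2019-11-15 psqrt/rsqrt entry in the log above reorders the Newton-Raphson update so that y^2 is never formed. The following is a scalar sketch of that reordering only, not Eigen's packet code: the function names `rsqrt_step_before` and `rsqrt_step_after` are made up for illustration, and it assumes float arithmetic is evaluated in single precision with denormals enabled (no flush-to-zero, no fast-math).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: one Newton-Raphson step for y ~ 1/sqrt(x),
// written as in the "Before" formula: y * (1.5 - x/2 * y^2).
inline float rsqrt_step_before(float x, float y) {
  assert(x > 0.0f && y > 0.0f);
  float y2 = y * y;                      // overflows to +inf when y is huge (tiny x)
  return y * (1.5f - (0.5f * x) * y2);
}

// Hypothetical helper: the same step in the "After" form:
// y * (1.5 - y * (x/2) * y). The product y * (x/2) is formed first,
// so no intermediate is as large as y^2.
inline float rsqrt_step_after(float x, float y) {
  assert(x > 0.0f && y > 0.0f);
  float t = y * (0.5f * x);              // roughly sqrt(x)/2, stays in range
  return y * (1.5f - t * y);
}
```

Under those assumptions, for the denormal input x = 1e-42f and a starting guess near 1e21, the "before" form overflows in y * y and returns a non-finite value, while the "after" form stays finite and moves the guess closer to 1/sqrt(x). For ordinary inputs the two forms agree up to rounding.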