path: root/Eigen/src/Core/arch
Commit message (Author, Age)
...
* | Specialized the pload1 packet primitive for Packet8f and Packet4d in order to take advantage of the vbroadcastss and vbroadcastsd instructions whenever possible. (Benoit Steiner, 2014-03-26)
| |
* | Merged latest updates from the parent branch (Benoit Steiner, 2014-03-26)
|\|
* | Vectorized the multiplication and division of complex numbers using AVX instructions. (Benoit Steiner, 2014-03-26)
| |
* | Used AVX instructions to vectorize the complex version of the pfirst and ploaddup packet primitives. Silenced a few compilation warnings. (Benoit Steiner, 2014-03-26)
| |
| * Implement new 1 packet x 8 gebp kernel (Gael Guennebaud, 2014-03-26)
| |
| * add pbroadcast2/4 generic intrinsics (Gael Guennebaud, 2014-03-26)
| |
* | Use AVX instructions to vectorize pset1<Packet2cd>, pset1<Packet4cf>, preverse<Packet2cd>, and preverse<Packet4cf> (Benoit Steiner, 2014-03-25)
| |
* | Used AVX instructions to vectorize the predux_min<Packet8f>, predux_min<Packet4d>, predux_max<Packet8f>, and predux_max<Packet4d> packet primitives. (Benoit Steiner, 2014-03-24)
| |
* | Added support for FMA instructions (Benoit Steiner, 2014-02-24)
| |
* | Added support for AVX to Eigen. (Benoit Steiner, 2014-01-29)
| |
| * Revert previous change and introduce a new workaround regarding gcc generating a shufps instruction instead of the more efficient pshufd instruction. The trick consists in introducing a new pload1 function to be used in low-level product kernels for which bug #203 does not apply. Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering. (Gael Guennebaud, 2014-03-20)
| |
| * Make gcc generate a pshufd instruction for pset1 (Gael Guennebaud, 2014-03-20)
|/
* Remove useless register keyword, and optimize predux_min/max for SSE4 (Gael Guennebaud, 2014-01-25)
|
* bug #677: fix usage of pld intrinsics for complexes (Gael Guennebaud, 2013-11-02)
|
* Fix bug #677: compilation issue on arm64 which does not have the PLD instruction (Gael Guennebaud, 2013-10-31)
|
* fix a few "dead stores" warnings (Gael Guennebaud, 2013-10-26)
|
* Fix ploaddup and lin-spaced with AltiVec. (Gael Guennebaud, 2013-09-10)
|
* typo (Gael Guennebaud, 2013-08-19)
|
* Fix bug #642: add vectorization of sqrt for doubles, and make sqrt really safe if EIGEN_FAST_MATH is disabled (Gael Guennebaud, 2013-08-19)
|
* Fix bug #590: NEON Duplicate lane load (Simon Pilgrim, 2013-06-23)
|
* Make psqrt work with numeric_limits<float>::min (Gael Guennebaud, 2013-06-14)
|
* Fix bug #613: psqrt was incorrect for small numbers (Jeff Dean, 2013-06-13)
|
* Fix bug #314: move remaining math functions from internal to numext namespace (Gael Guennebaud, 2013-06-10)
|
* Fix bug #591: minor optimization in NEON vectorization support (Simon Pilgrim, 2013-06-10)
|
* Add missing pconj specializations (Gael Guennebaud, 2013-05-17)
|
* Add SSE4 min/max for integers (Gael Guennebaud, 2013-03-20)
|
* Fix SSE plog<float> to return -INF on 0 (Gael Guennebaud, 2013-02-14)
|
* Suppress annoying "may be used uninitialized in this function" warning with gcc >= 4.6 (Gael Guennebaud, 2013-01-24)
|
* fix warning (Gael Guennebaud, 2012-08-01)
|
* fix lower acceptable bound of SSE pexp for double (Gael Guennebaud, 2012-07-31)
|
* add SSE pexp function for double, make use of _mm_floor_p* for pexp with SSE4.1 (Gael Guennebaud, 2012-07-27)
|
* Automatic relicensing to MPL2 using Keir's script. Manual fixup follows. (Benoit Jacob, 2012-07-13)
|
* fix typo (Konstantinos Margaritis, 2012-07-04)
|
* fix NEON port, use vget_lane_*() instead of temporary variables (saves extra load/store), following advice by Josh Bleecher Snyder <josharian@gmail.com>. Also implement pmadd() using vmla instead of nested padd/pmul. (Konstantinos Margaritis, 2012-07-04)
|
* fix bug #475: .exp() now returns +inf when overflow occurs (SSE) (Gael Guennebaud, 2012-06-14)
|
* ARM NEON supports multiply-accumulate instruction vmla, use that in pmadd(). (kmargar, 2012-05-28)
|
* Get rid of include directives inside namespace blocks (bug #339). (Jitse Niesen, 2012-04-15)
|
* proper C++ casting (Gael Guennebaud, 2012-01-31)
|
* fix static inline versus inline static issues (the former is the correct order) (Gael Guennebaud, 2012-01-31)
|
* Patches to support ARM NEON with Clang 3.0 and LLVM-GCC (Marton Danoczy, 2011-11-04)
|
* no comment (Gael Guennebaud, 2011-09-21)
|
* quick workaround of MSVC9's ICE in pset1 (Gael Guennebaud, 2011-09-21)
|
* NEON: fix plset (Gael Guennebaud, 2011-05-18)
|
* NEON: fix ploaddup (Gael Guennebaud, 2011-05-18)
|
* fix compilation on ARM NEON (missing AlignedOnScalar) (Gael Guennebaud, 2011-05-06)
|
* better fix for gcc 4.6.0 / ptrdiff_t, as suggested by Benoit (Thomas Capricelli, 2011-05-05)
|
* Fix compilation with gcc-4.6.0, patch provided by Anton Gladky <gladky.anton@gmail.com>, working on debian packaging. (Thomas Capricelli, 2011-05-05)
|
* re-enable fast pset1-pstore by introducing a new higher level pstore1 function (Gael Guennebaud, 2011-03-02)
|
* fix bug #203: revert to using _mm_set1_p[sd] (Benoit Jacob, 2011-02-28)
|
* remove now-useless comments (Benoit Jacob, 2011-02-27)
|