Commit message | Author | Age | ||
---|---|---|---|---|
... | ||||
* | | Specialized the pload1 packet primitive for Packet8f and Packet4d in order to take advantage of the vbroadcastss and vbroadcastsd instructions whenever possible. | 2014-03-26 | ||
* | | Merged latest updates from the parent branch | 2014-03-26 | ||
|\| | ||||
* | | Vectorized the multiplication and division of complex numbers using AVX instructions. | 2014-03-26 | ||
* | | Used AVX instructions to vectorize the complex version of the pfirst and ploaddup packet primitives. Silenced a few compilation warnings. | 2014-03-26 | ||
| * | Implement new 1 packet x 8 gebp kernel | 2014-03-26 | ||
| | | ||||
| * | add pbroadcast2/4 generic intrinsics | 2014-03-26 | ||
| | | ||||
* | | Use AVX instructions to vectorize pset1<Packet2cd>, pset1<Packet4cf>, preverse<Packet2cd>, and preverse<Packet4cf> | 2014-03-25 | ||
* | | Used AVX instructions to vectorize the predux_min<Packet8f>, predux_min<Packet4d>, predux_max<Packet8f>, and predux_max<Packet4d> packet primitives. | 2014-03-24 | ||
* | | Added support for FMA instructions | 2014-02-24 | ||
| | | ||||
* | | Added support for AVX to Eigen. | 2014-01-29 | ||
| | | ||||
| * | Revert previous change and introduce a new workaround regarding gcc generating a shufps instruction instead of the more efficient pshufd instruction. The trick consists in introducing a new pload1 function to be used in low level product kernels for which bug #203 does not apply. Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering. | 2014-03-20 | ||
| * | Make gcc generate a pshufd instruction for pset1 | 2014-03-20 | ||
|/ | ||||
* | Remove useless register keyword, and optimize predux_min/max for SSE4 | 2014-01-25 | ||
| | ||||
* | bug #677: fix usage of pld intrinsics for complexes | 2013-11-02 | ||
| | ||||
* | Fix bug #677: compilation issue on arm64 which does not have the PLD instruction | 2013-10-31 | ||
| | ||||
* | fix a few "dead stores" warnings | 2013-10-26 | ||
| | ||||
* | Fix ploaddup and lin-spaced with AltiVec. | 2013-09-10 | ||
| | ||||
* | typo | 2013-08-19 | ||
| | ||||
* | Fix bug #642: add vectorization of sqrt for doubles, and make sqrt really safe if EIGEN_FAST_MATH is disabled | 2013-08-19 | ||
* | Fix bug #590: NEON Duplicate lane load | 2013-06-23 | ||
| | ||||
* | Make psqrt work with numeric_limits<float>::min | 2013-06-14 | ||
| | ||||
* | Fix bug #613: psqrt was incorrect for small numbers | 2013-06-13 | ||
| | ||||
* | Fix bug #314: move remaining math functions from internal to numext namespace | 2013-06-10 | ||
| | ||||
* | Fix bug #591: minor optimization in NEON vectorization support | 2013-06-10 | ||
| | ||||
* | Add missing pconj specializations | 2013-05-17 | ||
| | ||||
* | Add SSE4 min/max for integers | 2013-03-20 | ||
| | ||||
* | Fix SSE plog<float> to return -INF on 0 | 2013-02-14 | ||
| | ||||
* | Suppress annoying "may be used uninitialized in this function" warning with gcc >= 4.6 | 2013-01-24 | ||
* | fix warning | 2012-08-01 | ||
| | ||||
* | fix lower acceptable bound of SSE pexp for double | 2012-07-31 | ||
| | ||||
* | add SSE pexp function for double, make use of _mm_floor_p* for pexp with SSE4.1 | 2012-07-27 | ||
| | ||||
* | Automatic relicensing to MPL2 using Keir's script. Manual fixup follows. | 2012-07-13 | ||
| | ||||
* | fix typo | 2012-07-04 | ||
| | ||||
* | fix NEON port, use vget_lane_*() instead of temporary variables (saves extra load/store), following advice by Josh Bleecher Snyder <josharian@gmail.com>. Also implement pmadd() using vmla instead of nested padd/pmul. | 2012-07-04 | ||
* | fix bug #475: .exp() now returns +inf when overflow occurs (SSE) | 2012-06-14 | ||
| | ||||
* | ARM NEON supports multiply-accumulate instruction vmla, use that in pmadd(). | 2012-05-28 | ||
| | ||||
* | Get rid of include directives inside namespace blocks (bug #339). | 2012-04-15 | ||
| | ||||
* | proper C++ casting | 2012-01-31 | ||
| | ||||
* | fix static inline versus inline static issues (the former is the correct order) | 2012-01-31 | ||
| | ||||
* | Patches to support ARM NEON with Clang 3.0 and LLVM-GCC | 2011-11-04 | ||
| | ||||
* | no comment | 2011-09-21 | ||
| | ||||
* | quick workaround for MSVC9's ICE in pset1 | 2011-09-21 | ||
| | ||||
* | NEON: fix plset | 2011-05-18 | ||
| | ||||
* | NEON: fix ploaddup | 2011-05-18 | ||
| | ||||
* | fix compilation on ARM NEON (missing AlignedOnScalar) | 2011-05-06 | ||
| | ||||
* | better fix for gcc 4.6.0 / ptrdiff_t, as suggested by Benoit | 2011-05-05 | ||
| | ||||
* | Fix compilation with gcc-4.6.0, patch provided by Anton Gladky <gladky.anton@gmail.com>, working on debian packaging. | 2011-05-05 | ||
* | re-enable fast pset1-pstore by introducing a new higher level pstore1 function | 2011-03-02 | ||
| | ||||
* | fix bug #203: revert to using _mm_set1_p[sd] | 2011-02-28 | ||
| | ||||
* | remove now-useless comments | 2011-02-27 | ||
| |