path: root/Eigen/src/Core/arch/AltiVec/PacketMath.h
Commit message [Author, Date]
* Implement plog and pexp for AltiVec. [Doug Kwan, 2015-07-30]
|
* Fix prototype of plset and generalize linspace functor. [Gael Guennebaud, 2015-08-07]
|
* Let unpacket_traits<> exposes the required alignment and make use of it everywhere [Gael Guennebaud, 2015-08-07]
|
* The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index [Gael Guennebaud, 2015-02-16]
|
* bug #936, patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD [Benoit Jacob, 2015-01-30]
|
* bug #936, patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_, [Benoit Jacob, 2015-01-31]
|   because this is what they are about. "Fused" means "no intermediate rounding between the mul and the add, only one rounding at the end". Instead, what we are concerned about here is whether a temporary register is needed, i.e. whether the MUL and ADD are separate instructions. Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA. But a true fused mul-add is only available on VFPv4: VFMA.
|
* bug #936, patch 1/3: some cleanup and renaming for consistency. [Benoit Jacob, 2015-01-30]
|
* fixed to make big-endian VSX work as well [Konstantinos Margaritis, 2014-10-01]
|
* prefetch are noops on VSX, actually disable the prefetch trait [Konstantinos Margaritis, 2014-09-21]
|
* fix compile error on big endian altivec [Konstantinos Margaritis, 2014-09-21]
|
* prefetch are noops on VSX [Konstantinos Margaritis, 2014-09-21]
|
* VSX supports vec_div, implement where appropriate (float/doubles) [Konstantinos Margaritis, 2014-09-21]
|
* VSX port passes packetmath_[1-5] tests! [Konstantinos Margaritis, 2014-09-20]
|
* 32-bit floats/ints, 64-bit doubles pass packetmath tests, complex 32/64-bit remaining [Konstantinos Margaritis, 2014-09-19]
|
* First time it compiles, but fails to pass the tests. [Konstantinos Margaritis, 2014-09-09]
|
* Initial VSX commit [Konstantinos Margaritis, 2014-08-29]
|
* Added HasDiv=1 to Altivec PacketMath.h, now vectorization_logic test passes. [Konstantinos Margaritis, 2014-07-15]
|   Added comments to the constants, indicative of the actual values
|
* Fix many long to int implicit conversions [Gael Guennebaud, 2014-07-08]
|
* Implement pbroadcast4 on altivec [Gael Guennebaud, 2014-04-25]
|
* Enable vectorization of pack_rhs with a column-major RHS. [Gael Guennebaud, 2014-04-25]
|   Rename and generalize Kernel<*> to PacketBlock<*,N>.
|
* Enable fused madd for Altivec [Gael Guennebaud, 2014-04-24]
|
* Implement ptranspose on altivec and fix pgather/pscatter [Gael Guennebaud, 2014-04-24]
|
* Add Altivec implementation of pgather/pscatter (not tested) [Gael Guennebaud, 2014-04-23]
|
* New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speedup on Haswell. This changeset also introduce new vector functions: ploadquad and predux4. [Gael Guennebaud, 2014-04-16]
|
* Add a mechanism to recursively access to half-size packet types [Gael Guennebaud, 2014-03-28]
|
* Fix ploaddup and lin-spaced with AltiVec. [Gael Guennebaud, 2013-09-10]
|
* Add missing pconj specializations [Gael Guennebaud, 2013-05-17]
|
* Automatic relicensing to MPL2 using Keirs script. Manual fixup follows. [Benoit Jacob, 2012-07-13]
|
* Get rid of include directives inside namespace blocks (bug #339). [Jitse Niesen, 2012-04-15]
|
* fix static inline versus inline static issues (the former is the correct order) [Gael Guennebaud, 2012-01-31]
|
* better fix for gcc 4.6.0 / ptrdiff_t, as suggested by Benoit [Thomas Capricelli, 2011-05-05]
|
* Fix compilation with gcc-4.6.0, patch provided by Anton Gladky <gladky.anton@gmail.com>, working on debian packaging. [Thomas Capricelli, 2011-05-05]
|
* fix AltiVec ploaddup [Gael Guennebaud, 2011-02-24]
|
* implement ploaddup for altivec and add respective unit test [Gael Guennebaud, 2011-02-23]
|
* Remove all references to EIGEN_TUNE_CPU_CACHE_SIZE. [Jitse Niesen, 2011-02-04]
|   This macro is no longer used as of revision 0212eec23f4cb64e8426bf32568156df302f8fcf .
|
* bug #86 : use internal:: namespace instead of ei_ prefix [Benoit Jacob, 2010-10-25]
|
* mixing types in product step 2: [Gael Guennebaud, 2010-07-11]
|   * pload* and pset1 are now templated on the packet type
|   * gemv routines are now embedded into a structure with a consistent API with respect to gemm
|   * some configurations of vector * matrix and matrix * matrix works fine, some need more work...
|
* sync [Gael Guennebaud, 2010-07-10]
|\
| * forgot to commit ei_p4f_FORWARD; [Konstantinos Margaritis, 2010-07-09]
| |
* | scalars fitting in a single packet requires more work, step 1 [Gael Guennebaud, 2010-07-08]
| |   * add a, Alignable trait
| |   * update LinearVectorization assignment
|/
* s/IsVectorized/Vectorizable [Gael Guennebaud, 2010-07-07]
|
* * add a IsVectorized mechanism (instead of packet-size>1...) [Gael Guennebaud, 2010-07-06]
|   * vectorize complex<double>
|
* AltiVec signed integer pmadd removed, proved to be 2x slower than the scalar trait(!). [Konstantinos Margaritis, 2010-06-28]
|
* Add a proof concept API to configure the blocking parameters at runtime. [Gael Guennebaud, 2010-06-07]
|   After validation of the final API I'll update the other products to use it.
|
* (proper commit this time) [Konstantinos Margaritis, 2010-04-24]
|   replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function. Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h. Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch(). NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
|
* Backed out changeset 6972c140f737874d88da0e225c7c27b4563a4518 [Konstantinos Margaritis, 2010-04-24]
|
* replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function. Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h. Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch(). NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS. [oem, 2010-04-24]
|
* fix copy pasted comment [Gael Guennebaud, 2010-03-05]
|
* Altivec brought up to date. Most tests pass and performance is better than before too! [Konstantinos Margaritis, 2010-03-05]
|
* Added initial NEON support, most tests pass however we had to use some hackish workarounds as gcc on ARM (both CodeSourcery 4.4.1 used and experimental 4.5) fail to ensure proper alignment with __attribute__((aligned(16))). This has to be fixed upstream to remove the workarounds. [Konstantinos Margaritis, 2010-03-03]
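
Note on the bug #936, patch 1.5/3 message above: the difference it draws between a "fused" multiply-add (one rounding at the end) and a multiply followed by an add (two roundings) can be seen in plain scalar C++. The following is only a minimal sketch under stated assumptions, not code from Eigen: the helper names madd_two_roundings and madd_fused are hypothetical, and std::fma from <cmath> stands in for a hardware fused instruction such as VFMA.

```cpp
#include <cmath>

// Hypothetical helpers for illustration only; these names do not exist in Eigen.

// Multiply, then add: the product is rounded to double before the addition,
// so the result carries two roundings. The volatile store is there to keep
// the compiler from contracting the expression into one fused instruction.
inline double madd_two_roundings(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused multiply-add: std::fma evaluates a*b+c as if with infinite
// precision and rounds once at the end.
inline double madd_fused(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = 1 + 2^-28, b = 1 - 2^-28 and c = -1, the exact product is 1 - 2^-56, which rounds to 1.0 in double precision. The two-rounding version therefore returns 0, while the fused version returns -2^-56. Whether the multiply and the add are issued as one instruction or two is a separate question from this rounding behaviour, which is the point of the renaming.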