path: root/Eigen/src/Core/arch/AltiVec/PacketMath.h
Commit message (Author, Age)
* First time it compiles, but fails to pass the tests. (Konstantinos Margaritis, 2014-09-09)
|
* Initial VSX commit (Konstantinos Margaritis, 2014-08-29)
|
* Added HasDiv=1 to Altivec PacketMath.h, now vectorization_logic test passes. (Konstantinos Margaritis, 2014-07-15)
|     Added comments to the constants, indicative of the actual values
* Fix many long to int implicit conversions (Gael Guennebaud, 2014-07-08)
|
* Implement pbroadcast4 on altivec (Gael Guennebaud, 2014-04-25)
|
* Enable vectorization of pack_rhs with a column-major RHS. (Gael Guennebaud, 2014-04-25)
|     Rename and generalize Kernel<*> to PacketBlock<*,N>.
* Enable fused madd for Altivec (Gael Guennebaud, 2014-04-24)
|
* Implement ptranspose on altivec and fix pgather/pscatter (Gael Guennebaud, 2014-04-24)
|
* Add Altivec implementation of pgather/pscatter (not tested) (Gael Guennebaud, 2014-04-23)
|
* New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speedup on Haswell. (Gael Guennebaud, 2014-04-16)
|     This changeset also introduces new vector functions: ploadquad and predux4.
* Add a mechanism to recursively access to half-size packet types (Gael Guennebaud, 2014-03-28)
|
* Fix ploaddup and lin-spaced with AltiVec. (Gael Guennebaud, 2013-09-10)
|
* Add missing pconj specializations (Gael Guennebaud, 2013-05-17)
|
* Automatic relicensing to MPL2 using Keirs script. Manual fixup follows. (Benoit Jacob, 2012-07-13)
|
* Get rid of include directives inside namespace blocks (bug #339). (Jitse Niesen, 2012-04-15)
|
* fix static inline versus inline static issues (the former is the correct order) (Gael Guennebaud, 2012-01-31)
|
* better fix for gcc 4.6.0 / ptrdiff_t, as suggested by Benoit (Thomas Capricelli, 2011-05-05)
|
* Fix compilation with gcc-4.6.0, patch provided by Anton Gladky <gladky.anton@gmail.com>, working on debian packaging. (Thomas Capricelli, 2011-05-05)
|
* fix AltiVec ploaddup (Gael Guennebaud, 2011-02-24)
|
* implement ploaddup for altivec and add respective unit test (Gael Guennebaud, 2011-02-23)
|
* Remove all references to EIGEN_TUNE_CPU_CACHE_SIZE. (Jitse Niesen, 2011-02-04)
|     This macro is no longer used as of revision 0212eec23f4cb64e8426bf32568156df302f8fcf.
* bug #86 : use internal:: namespace instead of ei_ prefix (Benoit Jacob, 2010-10-25)
|
* mixing types in product step 2: (Gael Guennebaud, 2010-07-11)
|     * pload* and pset1 are now templated on the packet type
|     * gemv routines are now embedded into a structure with a consistent API with respect to gemm
|     * some configurations of vector * matrix and matrix * matrix work fine, some need more work...
* sync (Gael Guennebaud, 2010-07-10)
|\
| * forgot to commit ei_p4f_FORWARD; (Konstantinos Margaritis, 2010-07-09)
| |
* | scalars fitting in a single packet requires more work, step 1 (Gael Guennebaud, 2010-07-08)
|/      * add a, Alignable trait
|       * update LinearVectorization assignment
* s/IsVectorized/Vectorizable (Gael Guennebaud, 2010-07-07)
|
* * add a IsVectorized mechanism (instead of packet-size>1...) (Gael Guennebaud, 2010-07-06)
|     * vectorize complex<double>
* AltiVec signed integer pmadd removed, proved to be 2x slower than the scalar trait(!). (Konstantinos Margaritis, 2010-06-28)
|
* Add a proof-of-concept API to configure the blocking parameters at runtime. (Gael Guennebaud, 2010-06-07)
|     After validation of the final API I'll update the other products to use it.
* (proper commit this time) (Konstantinos Margaritis, 2010-04-24)
|     replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function. Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h. Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch(). NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
* Backed out changeset 6972c140f737874d88da0e225c7c27b4563a4518 (Konstantinos Margaritis, 2010-04-24)
|
* replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function. (oem, 2010-04-24)
|     Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h. Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch(). NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
* fix copy pasted comment (Gael Guennebaud, 2010-03-05)
|
* Altivec brought up to date. Most tests pass and performance is better than before too! (Konstantinos Margaritis, 2010-03-05)
|
* Added initial NEON support, most tests pass however we had to use some hackish workarounds as gcc on ARM (both CodeSourcery 4.4.1 used and experimental 4.5) fail to ensure proper alignment with __attribute__((aligned(16))). (Konstantinos Margaritis, 2010-03-03)
|     This has to be fixed upstream to remove the workarounds.
* we were already aligning to 16-byte boundary fixed-size objects that are multiple of 16 bytes; now we also align to 8-byte boundary fixed-size objects that are multiple of 8 bytes. (Benoit Jacob, 2009-10-05)
|     That's only useful for now for double, not e.g. for Vector2f, but that didn't seem to hurt. Am I missing something? Do you prefer that we don't align Vector2f at all? Also, improvements in test_unalignedassert.
* remove sentence "Eigen itself is part of the KDE project." (Benoit Jacob, 2009-05-22)
|     it never made very precise sense. but now does it still make any?
* add SSE2 versions of sin, cos, log, exp using code from Julien Pommier. (Gael Guennebaud, 2009-03-25)
|     They are for float only, and they return exactly the same result as the standard versions in about 90% of the cases. Otherwise the max error is below 1e-7. However, for very large values (>1e3) the accuracy of sin and cos slightly decreases. They are about 3 or 4 times faster than 4 calls to their respective standard versions. So, is it ok to enable them by default in their respective functors?
* ei_pnegate implemented for AltiVec (Konstantinos A. Margaritis, 2009-03-20)
|
* add vectorization of unary operator-() (the AltiVec version is probably broken) (Gael Guennebaud, 2009-03-20)
|
* add the vectorization of abs (Gael Guennebaud, 2009-03-09)
|
* no reason for 3 vec_mins, 2 are enough apparently in ei_predux_min (Konstantinos A. Margaritis, 2009-02-12)
|
* modified ei_predux_min/max to actually use altivec instructions (Konstantinos A. Margaritis, 2009-02-12)
|
* * exit Sum.h, exit Prod.h, welcome vectorization of redux() ! (Gael Guennebaud, 2009-02-12)
|     * add vectorization for minCoeff and maxCoeff
* add ei_predux_mul for AltiVec (Gael Guennebaud, 2009-02-10)
|
* fixed preserve_mask definition for AltiVec (needed __vector keyword) (Konstantinos A. Margaritis, 2009-02-08)
|
* add bench_reverse, draft of a reverse vectorization for AltiVec, make global Scaling function static (Gael Guennebaud, 2009-02-06)
|
* Missing inline keywords in AltiVec/PacketMath were making Avogadro fail to compile (duplicate symbols). (Benoit Jacob, 2008-08-27)
|
* remove double ; (Benoit Jacob, 2008-08-27)
|