path: root/Eigen/src/Core/GenericPacketMath.h
Commit message (Author, Age)
* Remove now-unused protate PacketMath func (Benoit Jacob, 2016-05-24)
|
* Fixed the packet_traits for half floats. (Benoit Steiner, 2016-04-08)
|
* Added polygamma function. (Till Hoffmann, 2016-04-01)
|
* Added zeta function. (Till Hoffmann, 2016-04-01)
|
* Resolve bad merge. (Eugene Brevdo, 2016-03-08)
|
* Added support for vectorized type casting of int to char. (Benoit Steiner, 2016-02-03)
|
* Fixed compilation warning (Benoit Steiner, 2016-01-28)
|
* Add digamma for CPU + CUDA. Includes tests. (Eugene Brevdo, 2015-12-24)
|
* Cleanup (Benoit Steiner, 2015-12-08)
|
* Fixed a couple of typos (Benoit Steiner, 2015-12-07)
|   Cleaned up the code a bit.
* Add special functions to Eigen: lgamma, erf, erfc. (Eugene Brevdo, 2015-12-07)
|   Includes CUDA support and unit tests.
* added scalar_sign_op (both real,complex) (Mark Borgerding, 2015-11-24)
|
* Fix prototype of plset and generalize linspace functor. (Gael Guennebaud, 2015-08-07)
|
* Let unpacket_traits<> expose the required alignment and make use of it everywhere (Gael Guennebaud, 2015-08-07)
|
* First part of a big refactoring of alignment control to enable the handling of arbitrarily aligned buffers. (Gael Guennebaud, 2015-08-06)
|   It includes:
|   - AlignedBit flag is deprecated. Alignment is now specified by the evaluator through the 'Alignment' enum, e.g., evaluator<Xpr>::Alignment. Its value is in Bytes.
|   - Add several enums to specify alignment: Aligned8, Aligned16, Aligned32, Aligned64, Aligned128. AlignedMax corresponds to EIGEN_MAX_ALIGN_BYTES. Such enums are used to define the above Alignment value, and as the 'Options' template parameter of Map<> and Ref<>.
|   - The Aligned enum is now deprecated. It is now an alias for Aligned16.
|   - Currently, traits<Matrix<>>, traits<Array<>>, traits<Ref<>>, traits<Map<>>, and traits<Block<>> also expose the Alignment enum.
* Added support for prefetching on cuda devices (Benoit Steiner, 2015-07-17)
|
* bug #80: merge with d_hood branch on adding more coefficient-wise unary array functors (Gael Guennebaud, 2015-06-10)
|\
| * Remove packet isNaN, isInf, isFinite (Deanna Hood, 2015-03-17)
| |
| * Rename isinf to isInf (Deanna Hood, 2015-03-17)
| |
| * Add isfinite array support as isFinite (Deanna Hood, 2015-03-17)
| |
| * Rename isnan to isNaN (Deanna Hood, 2015-03-17)
| |
| * Add hyperbolic trigonometric functions from std array support (Deanna Hood, 2015-03-11)
| |
| * Add log10 array support (Deanna Hood, 2015-03-11)
| |
| * Additional unary coeff-wise functors (isnan, round, arg, e.g.) (Deanna Hood, 2015-03-11)
| |
* | Improved the default implementation of prsqrt (Benoit Steiner, 2015-02-28)
| |
* | Pulled latest updates from trunk (Benoit Steiner, 2015-02-27)
|\|
* | Added support for vectorized type casting of tensors (Benoit Steiner, 2015-02-27)
| |
* | Added support for fast reciprocal square root computation. (Benoit Steiner, 2015-02-26)
| |
| * Reimplement the selection between rotating and non-rotating kernels (Benoit Jacob, 2015-02-27)
| |   using templates instead of macros and if()'s. That was needed to fix the build of unit tests on ARM, which I had broken. My bad for not testing earlier.
| * Replace a static assert by a runtime one, fixes the build of unit tests on ARM (Benoit Jacob, 2015-02-27)
|/    Also safely assert in the non-implemented path that should never be taken in practice, and would return wrong results.
* bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path (Benoit Jacob, 2015-02-18)
|   This is substantially faster on ARM, where it's important to minimize the number of loads. This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome. Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower).
* The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index (Gael Guennebaud, 2015-02-16)
|
* Pulled the latest changes from the trunk (Benoit Steiner, 2015-02-06)
|\
| * Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively. (Gael Guennebaud, 2014-11-04)
| |
| * bug #701: workaround (min) and (max) blocking ADL by introducing numext::mini and numext::maxi internal functions and a EIGEN_NOT_A_MACRO macro. (Gael Guennebaud, 2014-10-20)
| |
* | Misc improvements and cleanups (Benoit Steiner, 2014-10-13)
| |
* | More tests to validate the const-correctness of the tensor code. (Benoit Steiner, 2014-10-02)
| |
* | Pulled in the latest changes from the Eigen trunk (Benoit Steiner, 2014-08-13)
|\|
| * Fix many long to int implicit conversions (Gael Guennebaud, 2014-07-08)
| |
| * chmod -x Eigen/src/Core/GenericPacketMath.h (Chen-Pang He, 2014-07-07)
| |
| * Add component-wise atan() function (see bug #80). (Roger Martin, 2014-06-19)
| |
* | Created the pblend packet primitive and implemented it using SSE and AVX instructions. (Benoit Steiner, 2014-06-06)
|/
* Enable vectorization of pack_rhs with a column-major RHS. (Gael Guennebaud, 2014-04-25)
|   Rename and generalize Kernel<*> to PacketBlock<*,N>.
* New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speedup on Haswell. (Gael Guennebaud, 2014-04-16)
|   This changeset also introduces new vector functions: ploadquad and predux4.
* Add a mechanism to recursively access half-size packet types (Gael Guennebaud, 2014-03-28)
|
* Implemented the SSE version of the gather and scatter packet primitives. (Benoit Steiner, 2014-03-27)
|
* Introduced pscatter/pgather packet primitives. They will be used to optimize the loop peeling code of the block-panel matrix multiplication kernel. (Benoit Steiner, 2014-03-27)
|
* Created the ptranspose packet primitive that can transpose an array of N packets, where N is the number of words in each packet. (Benoit Steiner, 2014-03-26)
|   This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions. Implemented the primitive using SSE instructions.
* add pbroadcast2/4 generic intrinsics (Gael Guennebaud, 2014-03-26)
|
* Revert previous change and introduce a new workaround regarding gcc generating a shufps instruction instead of the more efficient pshufd instruction. (Gael Guennebaud, 2014-03-20)
|   The trick consists in introducing a new pload1 function to be used in low level product kernels for which bug #203 does not apply. Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering.