Commit message | Author | Age | |
---|---|---|---|
* | Remove now-unused protate PacketMath func | 2016-05-24 | |
| | |||
* | Fixed the packet_traits for half floats. | 2016-04-08 | |
| | |||
* | Added polygamma function. | 2016-04-01 | |
| | |||
* | Added zeta function. | 2016-04-01 | |
| | |||
* | Resolve bad merge. | 2016-03-08 | |
| | |||
* | Added support for vectorized type casting of int to char. | 2016-02-03 | |
| | |||
* | Fixed compilation warning | 2016-01-28 | |
| | |||
* | Add digamma for CPU + CUDA. Includes tests. | 2015-12-24 | |
| | |||
* | Cleanup | 2015-12-08 | |
| | |||
* | Fixed a couple of typos | 2015-12-07 | |
| | | | | Cleaned up the code a bit. | ||
* | Add special functions to Eigen: lgamma, erf, erfc. | 2015-12-07 | |
| | | | | Includes CUDA support and unit tests. | ||
* | added scalar_sign_op (both real,complex) | 2015-11-24 | |
| | |||
* | Fix prototype of plset and generalize linspace functor. | 2015-08-07 | |
| | |||
* | Let unpacket_traits<> expose the required alignment and make use of it ↵ | 2015-08-07 | |
| | | | | everywhere | ||
* | First part of a big refactoring of alignment control to enable the handling ↵ | 2015-08-06 | |
| | | | | | | | | | of arbitrarily aligned buffers. It includes: - AlignedBit flag is deprecated. Alignment is now specified by the evaluator through the 'Alignment' enum, e.g., evaluator<Xpr>::Alignment. Its value is in Bytes. - Add several enums to specify alignment: Aligned8, Aligned16, Aligned32, Aligned64, Aligned128. AlignedMax corresponds to EIGEN_MAX_ALIGN_BYTES. Such enums are used to define the above Alignment value, and as the 'Options' template parameter of Map<> and Ref<>. - The Aligned enum is now deprecated. It is now an alias for Aligned16. - Currently, traits<Matrix<>>, traits<Array<>>, traits<Ref<>>, traits<Map<>>, and traits<Block<>> also expose the Alignment enum. | ||
* | Added support for prefetching on cuda devices | 2015-07-17 | |
| | |||
* | bug #80: merge with d_hood branch on adding more coefficient-wise unary ↵ | 2015-06-10 | |
|\ | | | | | | | array functors | ||
| * | Remove packet isNaN, isInf, isFinite | 2015-03-17 | |
| | | |||
| * | Rename isinf to isInf | 2015-03-17 | |
| | | |||
| * | Add isfinite array support as isFinite | 2015-03-17 | |
| | | |||
| * | Rename isnan to isNaN | 2015-03-17 | |
| | | |||
| * | Add hyperbolic trigonometric functions from std array support | 2015-03-11 | |
| | | |||
| * | Add log10 array support | 2015-03-11 | |
| | | |||
| * | Additional unary coeff-wise functors (isnan, round, arg, e.g.) | 2015-03-11 | |
| | | |||
* | | Improved the default implementation of prsqrt | 2015-02-28 | |
| | | |||
* | | Pulled latest updates from trunk | 2015-02-27 | |
|\| | |||
* | | Added support for vectorized type casting of tensors | 2015-02-27 | |
| | | |||
* | | Added support for fast reciprocal square root computation. | 2015-02-26 | |
| | | |||
| * | Reimplement the selection between rotating and non-rotating kernels | 2015-02-27 | |
| | | | | | | | | | | | | using templates instead of macros and if()'s. That was needed to fix the build of unit tests on ARM, which I had broken. My bad for not testing earlier. | ||
| * | Replace a static assert by a runtime one, fixes the build of unit tests on ARM | 2015-02-27 | |
|/ | | | | | Also safely assert in the non-implemented path that should never be taken in practice, and would return wrong results. | ||
* | bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path | 2015-02-18 | |
| | | | | | | | | This is substantially faster on ARM, where it's important to minimize the number of loads. This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome. Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower). | ||
* | The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index | 2015-02-16 | |
| | |||
* | Pulled the latest changes from the trunk | 2015-02-06 | |
|\ | |||
| * | Introduce unified macros to identify compiler, OS, and architecture. They ↵ | 2014-11-04 | |
| | | | | | | | | are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively. | ||
| * | bug #701: workaround (min) and (max) blocking ADL by introducing ↵ | 2014-10-20 | |
| | | | | | | | | numext::mini and numext::maxi internal functions and a EIGEN_NOT_A_MACRO macro. | ||
* | | Misc improvements and cleanups | 2014-10-13 | |
| | | |||
* | | More tests to validate the const-correctness of the tensor code. | 2014-10-02 | |
| | | |||
* | | Pulled in the latest changes from the Eigen trunk | 2014-08-13 | |
|\| | |||
| * | Fix many long to int implicit conversions | 2014-07-08 | |
| | | |||
| * | chmod -x Eigen/src/Core/GenericPacketMath.h | 2014-07-07 | |
| | | |||
| * | Add component-wise atan() function (see bug #80). | 2014-06-19 | |
| | | |||
* | | Created the pblend packet primitive and implemented it using SSE and AVX ↵ | 2014-06-06 | |
|/ | | | | instructions. | ||
* | Enable vectorization of pack_rhs with a column-major RHS. | 2014-04-25 | |
| | | | | Rename and generalize Kernel<*> to PacketBlock<*,N>. | ||
* | New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge ↵ | 2014-04-16 | |
| | | | | | | speedup on Haswell. This changeset also introduces new vector functions: ploadquad and predux4. | ||
* | Add a mechanism to recursively access half-size packet types | 2014-03-28 | |
| | |||
* | Implemented the SSE version of the gather and scatter packet primitives. | 2014-03-27 | |
| | |||
* | Introduced pscatter/pgather packet primitives. They will be used to optimize ↵ | 2014-03-27 | |
| | | | | the loop peeling code of the block-panel matrix multiplication kernel. | ||
* | Created the ptranspose packet primitive that can transpose an array of N ↵ | 2014-03-26 | |
| | | | | | | packets, where N is the number of words in each packet. This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions. Implemented the primitive using SSE instructions. | ||
* | add pbroadcast2/4 generic intrinsics | 2014-03-26 | |
| | |||
* | Revert previous change and introduce a new workaround regarding gcc ↵ | 2014-03-20 | |
| | | | | | | | generating a shufps instruction instead of the more efficient pshufd instruction. The trick consists in introducing a new pload1 function to be used in low level product kernels for which bug #203 does not apply. Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering. |