| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for Arm's new vector extension SVE (Scalable Vector Extension). In contrast to other vector extensions that are supported by Eigen, SVE types are inherently *sizeless*. For the use in Eigen we fix their size at compile-time (note that this is not necessary in general, SVE is *length agnostic*).
During compilation the flag `-msve-vector-bits=N` has to be set where `N` is a power of two in the range of `128`to `2048`, indicating the length of an SVE vector.
Since SVE is rather young, we decided to disable it by default even if it would be available. A user has to enable it explicitly by defining `EIGEN_ARM64_USE_SVE`.
This patch introduces the packet types `PacketXf` and `PacketXi` for packets of `float` and `int32_t` respectively. The size of these packets depends on the SVE vector length. E.g. if `-msve-vector-bits=512` is set, `PacketXf` will contain `512/32 = 16` elements.
This MR is joint work with Miguel Tairum <miguel.tairum@arm.com>.
|
|
|
|
|
|
| |
across platforms.
Change test to only test for NaN-propagation for pfmin/pfmax.
|
| |
|
|\ |
|
| | |
|
|/ |
|
|
|
|
| |
Had to introduce a UndefinedIncr constant for non structured list of indices.
|
| |
|
| |
|
| |
|
|
|
|
| |
remove the remaining usages of EvalBeforeNestingBit.
|
| |
|
| |
|
| |
|
|
|
|
| |
functors. This makes them more cuda friendly.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- Dynamic is now an invalid value
- introduce a HugeCost constant to be used for runtime-cost values or arbitrarily huge cost
- add sanity checks for cost values: must be >=0 and not too large
This change provides several benefits:
- it fixes shortcoming is some cost computation where the Dynamic case was not properly handled.
- it simplifies cost computation logic, and should avoid future similar shortcomings.
- it allows to distinguish between different level of dynamic/huge/infinite cost
- it should enable further simplifications in the computation of costs (save compilation time)
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
of arbitrarily aligned buffers. It includes:
- AlignedBit flag is deprecated. Alignment is now specified by the evaluator through the 'Alignment' enum, e.g., evaluator<Xpr>::Alignment. Its value is in Bytes.
- Add several enums to specify alignment: Aligned8, Aligned16, Aligned32, Aligned64, Aligned128. AlignedMax corresponds to EIGEN_MAX_ALIGN_BYTES. Such enums are used to define the above Alignment value, and as the 'Options' template parameter of Map<> and Ref<>.
- The Aligned enum is now deprecated. It is now an alias for Aligned16.
- Currently, traits<Matrix<>>, traits<Array<>>, traits<Ref<>>, traits<Map<>>, and traits<Block<>> also expose the Alignment enum.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
functors for comparison operators
|
|
|
|
| |
encoding it in the options.
|
|
|
|
| |
enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary.
|
|
|
|
|
|
| |
data()/*Stride() for dense matrices),
and a CompressedAccessBit flag (similar to DirectAccessBit for dense matrices).
|
| |
|
| |
|
|\ |
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| | |
glu_shape<S1,S2> helper to assemble sparse/dense shapes with triagular/seladjoint views.
|
| |
| |
| |
| | |
coefficient-wise operations.
|
| |
| |
| |
| |
| |
| | |
storage order.
It is currently only used in Product.
|
| | |
|
| | |
|
| | |
|
| | |
|
|/ |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
between Diagonal<S,-1> (the first sub diagonal) and a runtime super/sub diagonal which is now:
Diagonal<S,DynamicIndex>
|
|
|
|
|
|
|
|
| |
- Add evaluator_traits with HasEvalTo flag, which is true if evaluator
has evalTo() function.
- Add AllAtOnce traversal, which calls evalTo() in evaluator.
- If source evaluator in copy_using_evaluator has HasEvalTo set, then
use AllAtOnce traversal.
|
|
|
|
| |
This gets rid of some warnings on Intel Composer XE, apparently.
|
| |
|
|
|
|
|
|
|
| |
- Add support for Upper or Lower inputs.
- Add supports for sparse RHS
- Remove transposed cases, remove ordering method interface
- Add full access to PARDISO parameters
|