aboutsummaryrefslogtreecommitdiffhomepage
path: root/Eigen/src/Core/arch/AltiVec/MatrixProductMMA.h
Commit message (Collapse)AuthorAge
* EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X ↵Gravatar Chip-Kerchner2021-06-16
| | | | slowdown) when used with Tensorflow. Changing to EIGEN_ALWAYS_INLINE where appropiate.
* Fix taking address of rvalue compiler issue with TensorFlow (plus other ↵Gravatar Chip-Kerchner2021-04-21
| | | | warnings).
* Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3).Gravatar Chip Kerchner2021-03-25
|
* Make half/bfloat16 constructor take inputs by value, fix powerpc test.Gravatar Antonio Sanchez2021-02-27
| | | | | | | | | | | | Since `numeric_limits<half>::max_exponent` is a static inline constant, it cannot be directly passed by reference. This triggers a linker error in recent versions of `g++-powerpc64le`. Changing `half` to take inputs by value fixes this. Wrapping `max_exponent` with `int(...)` to make an addressable integer also fixes this and may help with other custom `Scalar` types down-the-road. Also eliminated some compile warnings for powerpc.
* Fix clang compile when no MMA flags are set. Simplify MMA compiler detection.Gravatar Chip-Kerchner2021-02-24
|
* Having forward template function declarations in a P10 file causes bad code ↵Gravatar Chip-Kerchner2021-02-24
| | | | in certain situations.
* Fixes to support old and new versions of the compilers for built-ins. Cast ↵Gravatar Chip-Kerchner2021-02-24
| | | | to non-const when using vector_pair with certain built-ins.
* Fix compilation errors with later versions of GCC and use of MMA.Gravatar Chip-Kerchner2021-02-22
|
* Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_productGravatar Chip Kerchner2021-02-17
|
* Add support for dynamic dispatch of MMA instructions for POWER 10Gravatar Pedro Caldeira2020-11-12