| | Commit message | Author | Age |
---|---|---|---|
* | EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with TensorFlow. Changing to EIGEN_ALWAYS_INLINE where appropriate. | Chip-Kerchner | 2021-06-16 |
* | Fix taking address of rvalue compiler issue with TensorFlow (plus other warnings). | Chip-Kerchner | 2021-04-21 |
* | Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3). | Chip Kerchner | 2021-03-25 |
* | Make half/bfloat16 constructor take inputs by value, fix powerpc test. Since `numeric_limits<half>::max_exponent` is a static inline constant, it cannot be directly passed by reference. This triggers a linker error in recent versions of `g++-powerpc64le`. Changing `half` to take inputs by value fixes this. Wrapping `max_exponent` with `int(...)` to make an addressable integer also fixes this and may help with other custom `Scalar` types down the road. Also eliminated some compile warnings for powerpc. | Antonio Sanchez | 2021-02-27 |
* | Fix clang compile when no MMA flags are set. Simplify MMA compiler detection. | Chip-Kerchner | 2021-02-24 |
* | Having forward template function declarations in a P10 file causes bad code in certain situations. | Chip-Kerchner | 2021-02-24 |
* | Fixes to support old and new versions of the compilers for built-ins. Cast to non-const when using vector_pair with certain built-ins. | Chip-Kerchner | 2021-02-24 |
* | Fix compilation errors with later versions of GCC and use of MMA. | Chip-Kerchner | 2021-02-22 |
* | Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_product | Chip Kerchner | 2021-02-17 |
* | Add support for dynamic dispatch of MMA instructions for POWER 10 | Pedro Caldeira | 2020-11-12 |
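
The 2021-02-27 entry describes a static constant that cannot be bound to a reference without a linker error, and two ways around it. The following is a minimal sketch of that pattern, not Eigen's actual code: `fake_limits`, `by_value_scalar`, and `by_ref_scalar` are hypothetical stand-ins, and the value 16 is chosen only for illustration. It assumes a pre-C++17 build, where an in-class initialized `static const` member has no out-of-class definition unless one is written.

```cpp
#include <cassert>

// Hypothetical stand-in for a numeric_limits-like traits class.
// The member is initialized in-class and has no out-of-class definition,
// so (before C++17) it may be read as a value but not bound to a reference.
struct fake_limits {
  static const int max_exponent = 16;  // illustrative value
};

// Hypothetical scalar type whose constructor takes its input by value.
// Passing the constant here only reads its value, which needs no definition.
struct by_value_scalar {
  int v;
  explicit by_value_scalar(int x) : v(x) {}
};

// Hypothetical scalar type whose constructor takes a const reference.
// Passing fake_limits::max_exponent directly would bind a reference to the
// member itself and can fail at link time with "undefined reference".
struct by_ref_scalar {
  int v;
  explicit by_ref_scalar(const int& x) : v(x) {}
};

// Fix 1: the constructor takes the input by value.
int exponent_via_value() {
  return by_value_scalar(fake_limits::max_exponent).v;
}

// Fix 2: int(...) creates a temporary, and the reference binds to that
// temporary instead of to the static member.
int exponent_via_wrapped_ref() {
  return by_ref_scalar(int(fake_limits::max_exponent)).v;
}
```

Both helpers return the constant's value; neither requires an out-of-class definition of `max_exponent`, which is the property the commit message relies on.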