| Commit message (Collapse) | Author | Age |
|
|
|
| |
CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code.
|
|
|
|
| |
LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations.
|
| |
|
|
|
|
|
|
|
| |
There's a missing `EIGEN_HAS_CXX14` -> `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`
replacement.
Fixes ##2267
|
|
|
|
|
|
|
|
|
| |
We can't make guarantees on alignment for existing calls to `pset`,
so we should default to loading unaligned. But in that case, we should
just use `ploadu` directly. For loading constants, this load should hopefully
get optimized away.
This is causing segfaults in Google Maps.
|
|
|
|
| |
slowdown) when used with Tensorflow. Changing to EIGEN_ALWAYS_INLINE where appropiate.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
MinGW spits out version strings like: `x86_64-w64-mingw32-g++ (GCC)
10-win32 20210110`, which causes the version extraction to fail.
Added support for this with tests.
Also added `make_unsigned` for `long long`, since mingw seems to
use that for `uint64_t`.
Related to #2268. CMake and build passes for me after this.
|
|
|
|
| |
optimization changing sign with --ffast-math enabled.
|
| |
|
| |
|
|
|
|
| |
Unified implementation using only `vzip`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This used to work for non-class types (e.g. raw function pointers) in
Eigen 3.3. This was changed in commit 11f55b29 to optimize the
evaluator:
> `sizeof((A-B).cwiseAbs2())` with A,B Vector4f is now 16 bytes, instead of 48 before this optimization.
though I cannot reproduce the 16 byte result. Both before the change
and after, with multiple compilers/versions, I always get a result of 40 bytes.
https://godbolt.org/z/MsjTc1PGe
This change modifies the code slightly to allow non-class types. The
final generated code is identical, and the expression remains 40 bytes
for the `abs2` sample case.
Fixes #2251
|
| |
|
|
|
|
|
|
|
|
|
| |
When calling conservativeResize() on a matrix with DontAlign flag, the
temporary variable used to perform the resize should have the same
Options as the original matrix to ensure that the correct override of
swap is called (i.e. PlainObjectBase::swap(DenseBase<OtherDerived> &
other). Calling the base class swap (i.e in DenseBase) results in
assertions errors or memory corruption.
|
| |
|
| |
|
|
|
|
| |
should fix #2242 .
|
|
|
|
| |
defined.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The boost library unfortunately specializes `conj` for various types and
assumes the original two-template-parameter version. This changes
restores the second parameter. This also restores ABI compatibility.
The specialization for `std::complex` is because `std::conj` is not
a device function. For custom complex scalar types, users should provide
their own `conj` implementation.
We may consider removing the unnecessary second parameter in the future - but
this will require modifying boost as well.
Fixes #2112.
|
|
|
|
|
|
|
|
| |
The cxx11 path for `numext::arg` incorrectly returned the complex type
instead of the real type, leading to compile errors. Fixed this and
added tests.
Related to !477, which uncovered the issue.
|
| |
|
|
|
|
|
|
| |
The original produced NaNs when dividing 0/b for subnormal b.
The `complex_divide_stable` was changed to use the more common
Smith's algorithm.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Fixes #2229.
For dynamic matrices with fixed-sized storage, only copy/swap
elements that have been set. Otherwise, this leads to inefficient
copying, and potential UB for non-initialized elements.
|
| |
|
|
|
|
| |
warnings).
|
| |
|
|
|
|
|
|
|
| |
Wrong shuffle was used. Need to interleave low/high halves with a
`permute` instruction.
Fixes #2215.
|
| |
|
|
|
|
| |
efficient `movsd` instruction for `pset1<Packet2cf>`.
|
|
|
|
|
|
| |
`std::result_of` and `std::invoke_result`.
Fixes #2209
|
| |
|
|
|
|
|
|
| |
Should have been 0.5 to widen the bounds, since this is inverse
precision. Setting to 0.5, however, leads to many more failing
tests at Google, so reverting to 1 for now.
|
|
|
|
| |
Also, remove unnecessary `pgather` operations.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adjust the relaxation step to use the condition
```
abs(subdiag[i]) <= epsilon * sqrt(abs(diag[i]) + abs(diag[i+1]))
```
for setting the subdiagonal entry to zero.
Also adjust Wilkinson shift for small `e = subdiag[end-1]` -
I couldn't find a reference for the original, and it was not
consistent with the Wilkinson definition.
Fixes #2191.
|
| |
|
|
|
|
| |
This fixes the problem with taking the address of temporary objects which clang11 treats as errors.
|
|
|
|
|
|
| |
user that the computation failed, possibly due to invalid input.
Make Jacobi and divide-and-conquer fail fast and return info() == InvalidInput if the matrix contains NaN or +/-Inf.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some CUDA/HIP constants fail on device with `constexpr` since they
internally rely on non-constexpr functions, e.g.
```
\#define CUDART_INF_F __int_as_float(0x7f800000)
```
This fails for cuda-clang (though passes with nvcc). These constants are
currently used by `device::numeric_limits`. For portability, we
need to remove `constexpr` from the affected functions.
For C++11 or higher, we should be able to rely on the `std::numeric_limits`
versions anyways, since the methods themselves are now `constexpr`, so
should be supported on device (clang/hipcc natively, nvcc with
`--expr-relaxed-constexpr`).
|
|
|
|
|
| |
Previously was `int`. Brought up by Kyle Snow (Polaris Geospatial
Services) on the mailing list.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Eigen unit-tests started failing on the HIP/ROCm platform, after the following commit
https://gitlab.com/libeigen/eigen/-/commit/e7b8643d70dfbb02ad94186169a8f16041f05bc2
```
In file included from /home/rocm-user/eigen/test/main.h:360:
In file included from /home/rocm-user/eigen/Eigen/QR:11:
In file included from /home/rocm-user/eigen/Eigen/Core:162:
/home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:300:17: error: constexpr function never produces a constant expression [-Winvalid-constexpr]
static float (max)() {
^
/home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:304:12: note: non-constexpr function '__int_as_float' cannot be used in a constant expression
return HIPRT_MAX_NORMAL_F;
^
/home/rocm-user/eigen/Eigen/src/Core/arch/HIP/hcc/math_constants.h:14:28: note: expanded from macro 'HIPRT_MAX_NORMAL_F'
#define HIPRT_MAX_NORMAL_F __int_as_float(0x7f7fffff)
^
/opt/rocm/hip/include/hip/hcc_detail/device_functions.h:913:32: note: declared here
__device__ static inline float __int_as_float(int x) {
^
```
The problem seems to that some of the constants defined in the HIP `math_constants.h` have a call to `__int_as_float` routine which is not declared `constexpr` in the HIP runtime header file.
Working around this issue for now, be skipping the const_expr support (enabled via the above commit) on HIP
|
| |
|
|
|
|
|
|
| |
innerStride(), outerStride(), and size()""
This reverts commit 5f0b4a4010af4cbf6161a0d1a03a747addc44a5d.
|
|
|
| |
This reverts commit f019b97aca82071f35726b1aaebf1c598770f0f5
|
| |
|