| Commit message | Author | Age |
|
|
|
|
| |
Rename EIGEN_EXCEPTIONS to EIGEN_USE_EXCEPTIONS, and allow disabling
exceptions with -DEIGEN_USE_EXCEPTIONS=0.
|
| |
|
| |
|
|
|
|
| |
division.
|
|
|
|
|
|
| |
- Move constructors can only be defaulted as NOEXCEPT if all members
have NOEXCEPT move constructors.
- gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter.
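
Both points can be illustrated with a minimal sketch. All type and function names below (`ThrowingMove`, `NothrowMove`, `HolderA`, `HolderB`, `Node`, `below_limit`) are hypothetical and not taken from Eigen, and the parenthesized comparison is only one plausible way to sidestep the parsing issue, not necessarily the change made in this commit.

```cpp
#include <type_traits>

// Hypothetical member type whose move constructor may throw.
struct ThrowingMove {
  ThrowingMove() = default;
  ThrowingMove(ThrowingMove&&) noexcept(false) {}
};

// Hypothetical member type whose move constructor cannot throw.
struct NothrowMove {
  NothrowMove() = default;
  NothrowMove(NothrowMove&&) noexcept = default;
};

// A defaulted move constructor derives its exception specification
// from the members it has to move.
struct HolderA {
  ThrowingMove m;
  HolderA() = default;
  HolderA(HolderA&&) = default;  // not noexcept: the member's move may throw
};

struct HolderB {
  NothrowMove m;
  HolderB() = default;
  HolderB(HolderB&&) = default;  // noexcept: every member is nothrow-movable
};

static_assert(!std::is_nothrow_move_constructible<HolderA>::value,
              "HolderA's move constructor may throw");
static_assert(std::is_nothrow_move_constructible<HolderB>::value,
              "HolderB's move constructor is noexcept");

// Hypothetical type used to show the comparison pattern.
struct Node {
  int c;
};

// Parenthesizing the member access keeps `<` from being read as the
// start of a template argument list by a confused parser.
constexpr bool below_limit(int a, const Node* b) { return a < (b->c); }
```

The `static_assert`s document why an unconditional `noexcept` on a defaulted move constructor is only safe when every member is itself nothrow-movable.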
|
| |
|
|
|
|
| |
PPC) for TensorFlow
|
| |
|
| |
|
| |
|
|
|
|
| |
This commit addresses this.
|
|
|
|
| |
CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code.
|
|
|
|
| |
LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations.
|
| |
|
|
|
|
|
|
|
| |
There's a missing `EIGEN_HAS_CXX14` -> `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`
replacement.
Fixes #2267
|
|
|
|
|
|
|
|
|
| |
We can't make guarantees on alignment for existing calls to `pset`,
so we should default to loading unaligned. But in that case, we should
just use `ploadu` directly. For loading constants, this load should hopefully
get optimized away.
This is causing segfaults in Google Maps.
|
|
|
|
| |
slowdown) when used with TensorFlow. Changing to EIGEN_ALWAYS_INLINE where appropriate.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
MinGW spits out version strings like: `x86_64-w64-mingw32-g++ (GCC)
10-win32 20210110`, which causes the version extraction to fail.
Added support for this with tests.
Also added `make_unsigned` for `long long`, since MinGW seems to
use that for `uint64_t`.
Related to #2268. CMake and build passes for me after this.
|
|
|
|
| |
optimization changing sign with `-ffast-math` enabled.
|
| |
|
| |
|
|
|
|
| |
Unified implementation using only `vzip`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This used to work for non-class types (e.g. raw function pointers) in
Eigen 3.3. This was changed in commit 11f55b29 to optimize the
evaluator:
> `sizeof((A-B).cwiseAbs2())` with A,B Vector4f is now 16 bytes, instead of 48 before this optimization.
though I cannot reproduce the 16 byte result. Both before the change
and after, with multiple compilers/versions, I always get a result of 40 bytes.
https://godbolt.org/z/MsjTc1PGe
This change modifies the code slightly to allow non-class types. The
final generated code is identical, and the expression remains 40 bytes
for the `abs2` sample case.
Fixes #2251
|
| |
|
|
|
|
|
|
|
|
|
| |
When calling `conservativeResize()` on a matrix with the `DontAlign` flag, the
temporary variable used to perform the resize should have the same
`Options` as the original matrix to ensure that the correct override of
`swap` is called (i.e. `PlainObjectBase::swap(DenseBase<OtherDerived>& other)`).
Calling the base class `swap` (i.e. in `DenseBase`) results in
assertion errors or memory corruption.
|
| |
|
| |
|
|
|
|
| |
should fix #2242.
|
|
|
|
| |
defined.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The boost library unfortunately specializes `conj` for various types and
assumes the original two-template-parameter version. This change
restores the second parameter. This also restores ABI compatibility.
The specialization for `std::complex` is because `std::conj` is not
a device function. For custom complex scalar types, users should provide
their own `conj` implementation.
We may consider removing the unnecessary second parameter in the future - but
this will require modifying boost as well.
Fixes #2112.
|
|
|
|
|
|
|
|
| |
The cxx11 path for `numext::arg` incorrectly returned the complex type
instead of the real type, leading to compile errors. Fixed this and
added tests.
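
The intended return type can be sketched in plain C++ as follows. `real_type` and `arg_sketch` are hypothetical names used only for illustration; they are not Eigen's actual traits or the `numext::arg` implementation.

```cpp
#include <complex>
#include <type_traits>

// Hypothetical trait: maps std::complex<T> to T and leaves other types alone.
template <typename T>
struct real_type {
  typedef T type;
};
template <typename T>
struct real_type<std::complex<T> > {
  typedef T type;
};

// The phase angle of a complex number is a real quantity, so the return
// type is the underlying real type rather than the complex type itself.
template <typename T>
typename real_type<T>::type arg_sketch(const T& x) {
  using std::arg;
  return arg(x);
}

static_assert(
    std::is_same<decltype(arg_sketch(std::complex<float>())), float>::value,
    "arg of complex<float> should be float");
```

Declaring the return type as the complex type instead would force a conversion from a real value back to complex at every call site, which is the kind of mismatch that surfaces as a compile error.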
Related to !477, which uncovered the issue.
|
| |
|
|
|
|
|
|
| |
The original produced NaNs when dividing 0/b for subnormal b.
The `complex_divide_stable` was changed to use the more common
Smith's algorithm.
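
For reference, a minimal sketch of Smith's algorithm for complex division follows. `smith_divide` and `smith_divide_example` are hypothetical names; this is the textbook form of the algorithm, not a copy of Eigen's `complex_divide_stable`, and it does not special-case a zero divisor.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <limits>

// Computes (a + ib) / (c + id) by scaling with the ratio of the smaller
// to the larger component of the divisor, which avoids forming c*c + d*d.
template <typename T>
std::complex<T> smith_divide(const std::complex<T>& x,
                             const std::complex<T>& y) {
  const T a = x.real(), b = x.imag();
  const T c = y.real(), d = y.imag();
  if (std::abs(c) >= std::abs(d)) {
    const T r = d / c;        // |r| <= 1
    const T den = c + d * r;  // equals (c*c + d*d) / c
    return std::complex<T>((a + b * r) / den, (b - a * r) / den);
  } else {
    const T r = c / d;        // |r| < 1
    const T den = c * r + d;  // equals (c*c + d*d) / d
    return std::complex<T>((a * r + b) / den, (b * r - a) / den);
  }
}

// Usage sketch: dividing zero by a subnormal divisor yields zero, not NaN.
inline void smith_divide_example() {
  const double tiny = std::numeric_limits<double>::denorm_min();
  const std::complex<double> q = smith_divide(
      std::complex<double>(0.0, 0.0), std::complex<double>(tiny, 0.0));
  assert(q.real() == 0.0 && q.imag() == 0.0);
}
```

A naive formula that divides by `c*c + d*d` underflows that denominator to zero for a subnormal divisor, so `0/b` becomes `0/0`; the ratio-based form never squares the divisor's components.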
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Fixes #2229.
For dynamic matrices with fixed-sized storage, only copy/swap
elements that have been set. Otherwise, this leads to inefficient
copying, and potential UB for non-initialized elements.
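
The idea can be sketched with a small stand-alone container. `FixedCapacityStorage` is a hypothetical stand-in for a dynamic-size object backed by a fixed-size buffer; it is not Eigen's storage class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Dynamic logical size, fixed-capacity backing buffer.
template <typename T, std::size_t Capacity>
struct FixedCapacityStorage {
  T data[Capacity];  // elements at index >= size are never initialized
  std::size_t size;

  FixedCapacityStorage() : size(0) {}

  // Copy only the elements that have been set. Copying the whole buffer
  // would do unnecessary work and read uninitialized elements.
  FixedCapacityStorage(const FixedCapacityStorage& other) : size(other.size) {
    std::copy(other.data, other.data + other.size, data);
  }

  FixedCapacityStorage& operator=(const FixedCapacityStorage& other) {
    if (this != &other) {
      size = other.size;
      std::copy(other.data, other.data + other.size, data);
    }
    return *this;
  }

  void push_back(const T& v) {
    assert(size < Capacity);
    data[size++] = v;
  }
};
```

With a capacity of 16 and three stored elements, the copy constructor above touches three elements rather than sixteen.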
|
|
|
|
| |
warnings).
|
| |
|
|
|
|
|
|
|
| |
Wrong shuffle was used. Need to interleave low/high halves with a
`permute` instruction.
Fixes #2215.
|
| |
|
|
|
|
| |
efficient `movsd` instruction for `pset1<Packet2cf>`.
|
|
|
|
|
|
| |
`std::result_of` and `std::invoke_result`.
Fixes #2209
|
| |
|
| |
|
|
|
|
| |
This fixes the problem of taking the address of temporary objects, which clang 11 treats as an error.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some CUDA/HIP constants fail on device with `constexpr` since they
internally rely on non-constexpr functions, e.g.
```
#define CUDART_INF_F __int_as_float(0x7f800000)
```
This fails for cuda-clang (though passes with nvcc). These constants are
currently used by `device::numeric_limits`. For portability, we
need to remove `constexpr` from the affected functions.
For C++11 or higher, we should be able to rely on the `std::numeric_limits`
versions anyway, since the methods themselves are now `constexpr`, so
should be supported on device (clang/hipcc natively, nvcc with
`--expt-relaxed-constexpr`).
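
A host-only sketch of the C++11 path follows. `limits_sketch` is a hypothetical wrapper, not Eigen's `device::numeric_limits`, and the example omits device annotations, so it says nothing about how a particular CUDA or HIP compiler treats the calls.

```cpp
#include <limits>

// Forwards to std::numeric_limits, whose members are constexpr since C++11,
// so the wrapper can be constexpr without relying on vendor macros.
template <typename T>
struct limits_sketch {
  static constexpr T infinity() { return std::numeric_limits<T>::infinity(); }
  static constexpr T max_value() { return (std::numeric_limits<T>::max)(); }
};

// Both values are usable in constant expressions.
static_assert(
    limits_sketch<float>::infinity() > limits_sketch<float>::max_value(),
    "infinity compares greater than the largest finite float");
```

A wrapper built on a macro that expands to a non-`constexpr` intrinsic call could not be used in a `static_assert` like this one.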
|
|
|
|
|
| |
Previously was `int`. Brought up by Kyle Snow (Polaris Geospatial
Services) on the mailing list.
|
| |
|