Commit message (Author, Date)
...
* Restore ABI compatibility for conj with 3.3, fix conflict with boost. (Antonio Sanchez, 2021-05-07)
  The boost library unfortunately specializes `conj` for various types and assumes the original two-template-parameter version. This change restores the second parameter, which also restores ABI compatibility. The specialization for `std::complex` exists because `std::conj` is not a device function. For custom complex scalar types, users should provide their own `conj` implementation. We may consider removing the unnecessary second parameter in the future, but this will require modifying boost as well. Fixes #2112.
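The two-parameter shape described above can be sketched roughly as follows. Every name here (`demo`, `is_complex`, `conj_impl`) is a hypothetical stand-in chosen for illustration, not Eigen's actual declaration; the point is only that external code specializing on both parameters keeps compiling while the second parameter exists.

```cpp
#include <complex>

namespace demo {  // hypothetical names, not Eigen's actual declarations

template <typename T>
struct is_complex { static const bool value = false; };
template <typename R>
struct is_complex<std::complex<R>> { static const bool value = true; };

// Two template parameters: an external library that specializes
// conj_impl<SomeType, false> only compiles while the second parameter exists.
template <typename Scalar, bool IsComplex = is_complex<Scalar>::value>
struct conj_impl {
  static Scalar run(const Scalar& x) { return x; }  // real types: identity
};

// Complex case written by hand instead of calling std::conj.
template <typename Scalar>
struct conj_impl<Scalar, true> {
  static Scalar run(const Scalar& x) { return Scalar(x.real(), -x.imag()); }
};

template <typename Scalar>
Scalar conj(const Scalar& x) { return conj_impl<Scalar>::run(x); }

}  // namespace demo
```

With this layout, `demo::conj(std::complex<double>(1.0, 2.0))` yields `(1.0, -2.0)` and `demo::conj(3.0)` returns its argument unchanged.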
* Clean up gpu device properties. (Antonio Sanchez, 2021-05-07)
  Made a class and singleton to encapsulate initialization and retrieval of device properties. Related to !481, which already changed the API to address a static linkage issue.
* Fix numext::arg return type. (Antonio Sanchez, 2021-05-07)
  The cxx11 path for `numext::arg` incorrectly returned the complex type instead of the real type, leading to compile errors. Fixed this and added tests. Related to !477, which uncovered the issue.
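As a rough illustration of the return-type point, independent of Eigen: the argument (phase angle) of a complex number is real, so the function should return the underlying real type. The names `demo::arg` and `demo::real_type` are hypothetical and do not reflect how `numext::arg` is implemented.

```cpp
#include <cmath>
#include <complex>
#include <type_traits>

namespace demo {  // hypothetical namespace, not part of Eigen

// Maps a scalar type to its real counterpart:
// T for a real T, R for std::complex<R>.
template <typename T>
struct real_type { using type = T; };
template <typename R>
struct real_type<std::complex<R>> { using type = R; };

// The phase angle is always real, so return the real type rather than T.
template <typename T>
typename real_type<T>::type arg(const T& x) {
  return std::arg(x);
}

}  // namespace demo
```

Here `demo::arg(std::complex<float>(0.0f, 1.0f))` has type `float`, not `std::complex<float>`.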
* Revert addition of unused `paddsub<Packet2cf>`. This fixes #2242 (Christoph Hertzberg, 2021-05-06)
* Simplify TensorRandom and remove time-dependence. (Antonio Sanchez, 2021-05-04)
  Time-dependence prevents tests from being repeatable. This has long been an issue with debugging the tensor tests. Removing this will allow future tests to be repeatable in the usual way.
  Also, the recently added macros in !476 are causing headaches across different platforms. For example, checking `_XOPEN_SOURCE` is leading to multiple ambiguous macro errors across Google, and `_DEFAULT_SOURCE`/`_SVID_SOURCE`/`_BSD_SOURCE` are sometimes defined with values, sometimes defined as empty, and sometimes not defined at all when they probably should be. This is leading to multiple build breakages.
  The simplest approach is to generate a seed via `Eigen::internal::random<uint64_t>()` if on CPU. For GPU, we use a hash based on the current thread ID (since `rand()` isn't supported on GPU). Fixes #1602.
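A minimal sketch of deriving a seed from a thread index without consulting the clock. The source does not say which hash Eigen uses; the splitmix64 finalizer below is an assumption chosen for illustration, and `seed_from_thread_id` is a hypothetical name.

```cpp
#include <cstdint>

// splitmix64 finalizer: a bijective 64-bit mixing function. This is an
// assumption for illustration; the commit does not name the hash it uses.
inline std::uint64_t mix64(std::uint64_t z) {
  z += 0x9e3779b97f4a7c15ULL;
  z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
  z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
  return z ^ (z >> 31);
}

// Derives a seed from a thread index with no dependence on time,
// so repeated runs see the same seed for the same thread.
inline std::uint64_t seed_from_thread_id(std::uint64_t thread_id) {
  return mix64(thread_id);
}
```

Because the mapping is deterministic, a failing test can be rerun with identical random inputs, and because it is bijective, distinct thread indices never share a seed.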
* Better CUDA complex division. (Antonio Sanchez, 2021-04-29)
  The original produced NaNs when dividing 0/b for subnormal b. `complex_divide_stable` was changed to use the more common Smith's algorithm.
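For reference, Smith's algorithm for complex division can be sketched as below. This is a generic CPU version for illustration, not the code of `complex_divide_stable`; the function name `smith_divide` is hypothetical.

```cpp
#include <cmath>
#include <complex>

// Smith's algorithm for (a + ib) / (c + id): divide through by the larger of
// |c| and |d| so the intermediate ratio r has magnitude at most 1.
template <typename T>
std::complex<T> smith_divide(const std::complex<T>& x, const std::complex<T>& y) {
  const T a = x.real(), b = x.imag();
  const T c = y.real(), d = y.imag();
  if (std::abs(c) >= std::abs(d)) {
    const T r = d / c;
    const T den = c + d * r;
    return std::complex<T>((a + b * r) / den, (b - a * r) / den);
  } else {
    const T r = c / d;
    const T den = c * r + d;
    return std::complex<T>((a * r + b) / den, (b * r - a) / den);
  }
}
```

A formula that divides by `c*c + d*d` underflows that denominator to zero when the divisor is subnormal, so `0/b` becomes `0/0`. Smith's form never squares the divisor, so the denominator stays nonzero in that case.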
* Add missing pcmp_lt_or_nan for NEON Packet4bf. (Antonio Sanchez, 2021-04-27)
* Added complex matrix unit tests for SelfAdjointEigenSolver (Theo Fletcher, 2021-04-26)
* Tests added and AVX512 bug fixed for pcmp_lt_or_nan (Jakub Lichman, 2021-04-25)
* Tests for pcmp_lt and pcmp_le added (Jakub Lichman, 2021-04-23)
* Fix for issue with static global variables in TensorDeviceGpu.h (Turing Eret, 2021-04-23)
  m_deviceProperties and m_devicePropInitialized are defined as global statics, so each translation unit gets its own copy. This can cause issues if initializeDeviceProp() is called in one translation unit and m_deviceProperties is then used in a different translation unit. Added inline functions getDeviceProperties() and getDevicePropInitialized(), which define those variables as static locals. As per the C++ standard 7.1.2/4, a static local declared in an inline function always refers to the same object, so this should be safer. Credit to Sun Chenggen for this fix. This fixes issue #1475.
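The pattern described above can be sketched as follows. `DeviceProps` and the snake_case function names are hypothetical stand-ins rather than the contents of TensorDeviceGpu.h, and the sketch makes no attempt at thread safety.

```cpp
#include <cstddef>
#include <vector>

struct DeviceProps { int multiprocessor_count = 0; };  // hypothetical stand-in

// A static local in an inline function refers to one object across all
// translation units, unlike a namespace-scope `static` defined in a header.
inline std::vector<DeviceProps>& device_properties() {
  static std::vector<DeviceProps> props;
  return props;
}

inline bool& device_props_initialized() {
  static bool initialized = false;
  return initialized;
}

// Initializes once; later calls are no-ops regardless of which
// translation unit makes them.
inline void initialize_device_props(std::size_t device_count) {
  if (!device_props_initialized()) {
    device_properties().resize(device_count);
    device_props_initialized() = true;
  }
}
```

Any translation unit that includes this header and calls `device_properties()` sees the data filled in by whichever unit called `initialize_device_props()` first.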
* Check existence of BSD random before use. (Antonio Sanchez, 2021-04-22)
  `TensorRandom` currently relies on BSD `random()`, which is not always available. The [linux manpage](https://man7.org/linux/man-pages/man3/srandom.3.html) gives the glibc condition:
  ```
  _XOPEN_SOURCE >= 500
      || /* Glibc since 2.19: */ _DEFAULT_SOURCE
      || /* Glibc <= 2.19: */ _SVID_SOURCE || _BSD_SOURCE
  ```
  In particular, this was failing to compile for MinGW via msys2. If not available, we fall back to using `rand()`.
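A rough sketch of such a guard, assuming the glibc condition is translated directly into preprocessor checks. The function name `portable_random` is hypothetical and this is not the code that was committed.

```cpp
#include <cstdlib>

// Use BSD random() only when the glibc feature-test condition from the man
// page holds; otherwise fall back to the standard rand().
#if (defined(_XOPEN_SOURCE) && _XOPEN_SOURCE >= 500) || \
    defined(_DEFAULT_SOURCE) || defined(_SVID_SOURCE) || defined(_BSD_SOURCE)
inline long portable_random() { return ::random(); }
#else
inline long portable_random() { return std::rand(); }
#endif
```

Note that a check of this kind is fragile: if `_XOPEN_SOURCE` is defined with an empty value, the comparison `_XOPEN_SOURCE >= 500` is not a valid preprocessor expression.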
* DenseStorage safely copy/swap. (Antonio Sanchez, 2021-04-22)
  Fixes #2229. For dynamic matrices with fixed-sized storage, only copy/swap elements that have been set. Otherwise, this leads to inefficient copying, and potential UB for non-initialized elements.
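The idea can be illustrated with a small fixed-capacity buffer whose logical size is tracked at runtime. `FixedCapacityBuffer` is a hypothetical stand-in, not Eigen's `DenseStorage`.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for a dynamic-size matrix backed by fixed-capacity
// storage: only the first `size` slots hold meaningful values.
template <typename T, std::size_t Capacity>
struct FixedCapacityBuffer {
  T data[Capacity];
  std::size_t size = 0;

  FixedCapacityBuffer() {}

  // Copy only the elements that have been set, not the full capacity,
  // so uninitialized slots are never read.
  FixedCapacityBuffer(const FixedCapacityBuffer& other) : size(other.size) {
    std::copy(other.data, other.data + other.size, data);
  }

  FixedCapacityBuffer& operator=(const FixedCapacityBuffer& other) {
    if (this != &other) {
      size = other.size;
      std::copy(other.data, other.data + other.size, data);
    }
    return *this;
  }
};
```

Copying a buffer with capacity 8 and size 2 touches two elements instead of eight, and the six unset slots of the source are never read.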
* Make vectorized compute_inverse_size4 compile with AVX. (Rasmus Munk Larsen, 2021-04-22)
* Compilation of basicbenchmark fixed (Jakub Lichman, 2021-04-21)
* Fix taking address of rvalue compiler issue with TensorFlow (plus other warnings). (Chip-Kerchner, 2021-04-21)
* HasExp added for AVX512 Packet8d (Jakub Lichman, 2021-04-20)
* Fix ldexp for AVX512 (#2215) (Antonio Sanchez, 2021-04-20)
  Wrong shuffle was used. Need to interleave low/high halves with a `permute` instruction. Fixes #2215.
* Before 3.4 branch (David Tellenbach, 2021-04-18)
* Modify googlehash use to account for namespace issues. (Antonio Sanchez, 2021-04-12)
  The namespace declaration for googlehash is a configurable macro that can be disabled. In particular, it is disabled within google, causing compile errors since `dense_hash_map`/`sparse_hash_map` are then in the global namespace instead of in `::google`.
  Here we perform a bit of gymnastics to allow for both `google::*_hash_map` and `*_hash_map`, while limiting namespace pollution. Symbols within the `::google` namespace are imported into `Eigen::google`.
  We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be defined.
* Avoid using uninitialized inputs and if available, use slightly more efficient `movsd` instruction for `pset1<Packet2cf>`. (Christoph Hertzberg, 2021-04-13)
* Fix typo in TensorDimensions.h (Rasmus Munk Larsen, 2021-04-12)
* Fix for float16 GPU unit test. (Rohit Santhanam, 2021-04-12)
* Use EIGEN_HAS_CXX11 and EIGEN_COMP_CXXVER macros to detect C++ version for `std::result_of` and `std::invoke_result`. (Christoph Hertzberg, 2021-04-12)
  Fixes #2209
* fixed doxygen for unsupported iterative solver module (Jens Wehner, 2021-04-11)
* Make iterators default constructible and assignable, by making... (Christoph Hertzberg, 2021-04-09)
* This fixes an issue where the compiler was not choosing the GPU specific specialization of ScanLauncher. (Rohit Santhanam, 2021-04-08)
  The issue was discovered when the GPU scan unit test was run and resulted in a segmentation fault. The segmentation fault occurred because the unit test allocated GPU memory and passed a pointer to that memory to the computation that it presumed would execute on the GPU. Because of the issue, the computation was instead scheduled to execute on the CPU, so the CPU attempted to access a GPU memory location.
  The fix expands the GPU specific ScanLauncher specialization to handle cases where vectorization is enabled. Previously, the GPU specialization was chosen only if vectorization was not used.
* Scaled epsilon the wrong way. (Antonio Sanchez, 2021-04-07)
  Should have been 0.5 to widen the bounds, since this is inverse precision. Setting to 0.5, however, leads to many more failing tests at Google, so reverting to 1 for now.
* Replace `-2147483648` by `-0.0f` or `-0.0` constants (this should fix #2189). (Christoph Hertzberg, 2021-04-07)
  Also, remove unnecessary `pgather` operations.
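For context, under IEEE-754 `-0.0f` is the float whose only set bit is the sign bit (`0x80000000`), which is the same bit pattern as the 32-bit two's-complement integer `-2147483648`. A small scalar sketch of using it as a sign mask, with hypothetical helper names:

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw IEEE-754 bits of a float.
inline std::uint32_t float_bits(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  return bits;
}

// -0.0f has only the sign bit set, so it can serve as a sign mask.
inline bool sign_bit_set(float x) {
  return (float_bits(x) & float_bits(-0.0f)) != 0;
}
```

Writing the mask as a floating-point constant keeps the intent visible and avoids depending on how the integer literal is typed.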
* Align local arrays to Packet boundary. (Rasmus Munk Larsen, 2021-04-06)
* Fix clang tidy warnings in AnnoyingScalar. (Antonio Sanchez, 2021-04-05)
  Clang-tidy complains that full specializations in headers can cause ODR violations. Marked these as `inline` to fix.
  It also complains about renaming arguments in specializations. Set the argument names to match.
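Both clang-tidy complaints can be illustrated with a toy function template. `twice` is a hypothetical example and has nothing to do with AnnoyingScalar itself.

```cpp
// Primary template, as it might appear in a header.
template <typename T>
T twice(const T& value) { return value + value; }

// A full specialization is an ordinary function rather than a template, so
// in a header it needs `inline` to avoid multiple-definition (ODR) problems
// when several translation units include it. The parameter name matches the
// primary template's.
template <>
inline bool twice<bool>(const bool& value) { return value; }
```

Without `inline`, including this header from two source files would emit two definitions of `twice<bool>` and the link would fail.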
* Fix SelfAdjointEigenSolver (#2191) (Antonio Sanchez, 2021-04-05)
  Adjust the relaxation step to use the condition
  ```
  abs(subdiag[i]) <= epsilon * sqrt(abs(diag[i]) + abs(diag[i+1]))
  ```
  for setting the subdiagonal entry to zero.
  Also adjust the Wilkinson shift for small `e = subdiag[end-1]`. I couldn't find a reference for the original, and it was not consistent with the Wilkinson definition.
  Fixes #2191.
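The relaxation condition from this commit message can be sketched as a standalone loop over a symmetric tridiagonal matrix. Using machine epsilon for `epsilon` and the name `deflate_subdiagonal` are assumptions; the solver's own loop structure is not shown in the source.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Zeroes subdiagonal entries of a symmetric tridiagonal matrix that satisfy
//   abs(subdiag[i]) <= epsilon * sqrt(abs(diag[i]) + abs(diag[i+1])).
// `diag` has n entries and `subdiag` has n - 1. Machine epsilon is assumed.
inline void deflate_subdiagonal(const std::vector<double>& diag,
                                std::vector<double>& subdiag) {
  const double epsilon = std::numeric_limits<double>::epsilon();
  for (std::size_t i = 0; i + 1 < diag.size() && i < subdiag.size(); ++i) {
    const double bound =
        epsilon * std::sqrt(std::abs(diag[i]) + std::abs(diag[i + 1]));
    if (std::abs(subdiag[i]) <= bound) subdiag[i] = 0.0;
  }
}
```

With a diagonal of all ones, a subdiagonal entry of `1e-20` falls under the bound and is zeroed, while an entry of `0.5` is left alone.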
* Fix two bugs in commit (Rasmus Munk Larsen, 2021-04-02)
* Fix address of temporary object errors in clang11. (Chip Kerchner, 2021-04-02)
  This fixes the problem with taking the address of temporary objects which clang11 treats as errors.
* Add CI infrastructure for pre-merge smoke tests. (David Tellenbach, 2021-04-01)
  This patch adds pre-merge smoke tests for x86 Linux using gcc-10 and clang-10. Closes #2188.
* Add CMake infrastructure for smoke testing (David Tellenbach, 2021-03-31)
  Necessary CMake changes to implement pre-merge smoke tests running via CI.
* Add an info() method to the SVDBase class to make it possible to tell the user that the computation failed, possibly due to invalid input. (Rasmus Munk Larsen, 2021-03-31)
  Make Jacobi and divide-and-conquer fail fast and return info() == InvalidInput if the matrix contains NaN or +/-Inf.
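The fail-fast behaviour can be sketched with a toy solver that validates its input before doing any work. `TinySolver` and the two-value enum are hypothetical stand-ins, not Eigen's `SVDBase` or its full `ComputationInfo`.

```cpp
#include <cmath>
#include <vector>

enum ComputationInfo { Success, InvalidInput };  // hypothetical subset of status codes

class TinySolver {  // hypothetical stand-in, not Eigen's SVDBase
 public:
  // Checks the input before doing any work and records the outcome.
  void compute(const std::vector<double>& coeffs) {
    m_info = Success;
    for (double c : coeffs) {
      if (!std::isfinite(c)) {  // rejects NaN and +/-Inf
        m_info = InvalidInput;
        return;
      }
    }
    // ... the actual decomposition would run here ...
  }

  ComputationInfo info() const { return m_info; }

 private:
  ComputationInfo m_info = Success;
};
```

A caller would run `compute(...)` and then test `info() == InvalidInput` before reading any results.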
* Add GitLab templates for issues and merge requests (Guoqiang QI, 2021-03-31)
  This patch adds GitLab templates for bug reports, feature and merge requests. This closes #2117.
* Fix CUDA constexpr issues for numeric_limits. (Antonio Sanchez, 2021-03-30)
  Some CUDA/HIP constants fail on device with `constexpr` since they internally rely on non-constexpr functions, e.g.
  ```
  #define CUDART_INF_F __int_as_float(0x7f800000)
  ```
  This fails for cuda-clang (though passes with nvcc). These constants are currently used by `device::numeric_limits`. For portability, we need to remove `constexpr` from the affected functions.
  For C++11 or higher, we should be able to rely on the `std::numeric_limits` versions anyway, since the methods themselves are now `constexpr`, so should be supported on device (clang/hipcc natively, nvcc with `--expt-relaxed-constexpr`).
* Use Index type in loop over coefficients. (Antonio Sanchez, 2021-03-29)
  Previously was `int`. Brought up by Kyle Snow (Polaris Geospatial Services) on the mailing list.
* Eliminate `round_impl` double-promotion warnings for c++03. (Antonio Sanchez, 2021-03-25)
* Un-defining EIGEN_HAS_CONSTEXPR on the HIP platform (Deven Desai, 2021-03-25)
  The Eigen unit-tests started failing on the HIP/ROCm platform, after the following commit https://gitlab.com/libeigen/eigen/-/commit/e7b8643d70dfbb02ad94186169a8f16041f05bc2
  ```
  In file included from /home/rocm-user/eigen/test/main.h:360:
  In file included from /home/rocm-user/eigen/Eigen/QR:11:
  In file included from /home/rocm-user/eigen/Eigen/Core:162:
  /home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:300:17: error: constexpr function never produces a constant expression [-Winvalid-constexpr]
      static float (max)() {
      ^
  /home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:304:12: note: non-constexpr function '__int_as_float' cannot be used in a constant expression
      return HIPRT_MAX_NORMAL_F;
      ^
  /home/rocm-user/eigen/Eigen/src/Core/arch/HIP/hcc/math_constants.h:14:28: note: expanded from macro 'HIPRT_MAX_NORMAL_F'
  #define HIPRT_MAX_NORMAL_F __int_as_float(0x7f7fffff)
      ^
  /opt/rocm/hip/include/hip/hcc_detail/device_functions.h:913:32: note: declared here
  __device__ static inline float __int_as_float(int x) {
      ^
  ```
  The problem seems to be that some of the constants defined in the HIP `math_constants.h` call the `__int_as_float` routine, which is not declared `constexpr` in the HIP runtime header file.
  Working around this issue for now by skipping the `constexpr` support (enabled via the above commit) on HIP.
* Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3). (Chip Kerchner, 2021-03-25)
* Revert "Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()"" (Steve Bronder, 2021-03-24)
  This reverts commit 5f0b4a4010af4cbf6161a0d1a03a747addc44a5d.
* Eliminate mixingtypes_7 warning. (Antonio Sanchez, 2021-03-24)
  `g_called` is not used in subtest 7, so it was generating a `-Wunneeded-internal-declaration` warning. Here we silence it by initializing the static variable.
* Revert "Uses _mm512_abs_pd for Packet8d pabs" (Christoph Hertzberg, 2021-03-23)
  This reverts commit f019b97aca82071f35726b1aaebf1c598770f0f5
* Re-enable CI for Power (David Tellenbach, 2021-03-22)
* Remove yet another comma at end of enum (David Tellenbach, 2021-03-18)
* Uses _mm512_abs_pd for Packet8d pabs (Steve Bronder, 2021-03-18)
* Split test commainitializer into two subtests (David Tellenbach, 2021-03-18)