Commit message (author, date)
...
* Fix missing `pfirst<Packet16b>` for MSVC. (Antonio Sanchez, 2020-10-16)
  It was only defined under one `#ifdef` case. This fixes the `packetmath_14` test for MSVC.
* Fix the specialization of pfrexp for AVX to be faster when AVX2/AVX512DQ is not available, and avoid undefined behavior in C++. Also mask off the sign bit when extracting the exponent. (Rasmus Munk Larsen, 2020-10-15)
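As a scalar illustration of masking the sign bit before extracting the exponent, here is a minimal sketch. It is not Eigen's AVX code: `biased_exponent` is a hypothetical helper, and it assumes `double` is IEEE-754 binary64. The bits are copied with `std::memcpy` rather than read through a pointer cast, which keeps the conversion well defined in C++.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Eigen): returns the biased 11-bit exponent
// field of a double, assuming the IEEE-754 binary64 layout.
inline int biased_exponent(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "this sketch assumes a 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);   // well-defined way to read the bits
  bits &= 0x7FFFFFFFFFFFFFFFULL;         // clear the sign bit first
  return static_cast<int>(bits >> 52);   // shift out the 52 mantissa bits
}
```

Without the mask, the sign bit of a negative input would land in bit 11 of the shifted value, so `-1.0` would give 3071 instead of 1023.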
* Fix for ROCm/HIP breakage - 201013 (Deven Desai, 2020-10-15)
  The following commit seems to have introduced regressions in ROCm/HIP support:
  https://gitlab.com/libeigen/eigen/-/commit/183a208212353ccf81a664d25dc7660b6269acdd
  It causes some unit tests to fail with the following error:
  ```
  ...
  Eigen/src/Core/GenericPacketMath.h:322:3: error: no member named 'bit_and' in the global namespace; did you mean 'std::bit_and'?
  ...
  Eigen/src/Core/GenericPacketMath.h:329:3: error: no member named 'bit_or' in the global namespace; did you mean 'std::bit_or'?
  ...
  Eigen/src/Core/GenericPacketMath.h:336:3: error: no member named 'bit_xor' in the global namespace; did you mean 'std::bit_xor'?
  ...
  ```
  The error occurs because, when compiling the device code in HIP/CUDA, the compiler will pick up some of the std functions (whose calls are prefixed by EIGEN_USING_STD) from the global namespace (i.e. use ::bit_xor instead of std::bit_xor). For this to work, those functions must be declared in the global namespace in the HIP/CUDA header files. The `bit_and`, `bit_or` and `bit_xor` routines are not declared in the HIP header file that contains the decls for the std math functions (`math_functions.h`), and this is the cause of the error above.
  It seems that the newer HIP compilers do support the calling of `std::` math routines within device code, and the ideal fix here would have been to change all calls to std math functions in EIGEN to use the `std::` namespace (instead of the global namespace) when compiling with the HIP compiler. However, it seems there was a recent commit to remove the EIGEN_USING_STD_MATH macro and collapse its uses into the EIGEN_USING_STD macro (https://gitlab.com/libeigen/eigen/-/commit/4091f6b25c5ad0ca3f7c00bd82bfd7ca1bbedee3). Replacing all std math calls would essentially require resurrecting the EIGEN_USING_STD_MATH macro, so not choosing that option.
  Also, HIP compilers only support std math calls within device code, and not all std functions (specifically not malloc/free, which are prefixed via EIGEN_USING_STD). So modifying the EIGEN_USING_STD implementation to use the std:: namespace for HIP will not work either. Hence going for the ugly solution of special-casing the three calls that are breaking the HIP compile, to explicitly use the std:: namespace.
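The two call styles discussed in this commit message can be sketched in plain host C++. This is only an illustration under stated assumptions: `and_via_using` and `and_qualified` are hypothetical names, and the `using` declaration stands in for what `EIGEN_USING_STD` is assumed to do on the host (the actual macro definition is not shown in this log).

```cpp
#include <functional>

// Style 1 (hypothetical): bring the name into scope, then call it unqualified.
// On the host this resolves to std::bit_and. The breakage described above
// arises when the device compile looks the name up in the global namespace.
template <typename T>
T and_via_using(T a, T b) {
  using std::bit_and;
  return bit_and<T>()(a, b);
}

// Style 2 (hypothetical): qualify the call explicitly, which is the kind of
// special-casing the commit describes for bit_and, bit_or and bit_xor.
template <typename T>
T and_qualified(T a, T b) {
  return std::bit_and<T>()(a, b);
}
```

Both functions compute the same result on the host; the difference only matters where the unqualified name is resolved against a namespace that lacks the declaration.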
* Revert change from 4e4d3f32d168ed9ce09d950f099a60ddcd11240f that broke BFloat16.h build with older compilers. (Rasmus Munk Larsen, 2020-10-15)
* Add AVX plog<Packet4d> and AVX512 plog<Packet8d> ops, also unified AVX512 plog<Packet16f> op with generic API (Guoqiang QI, 2020-10-15)
* Add specializations for pmin/pmax with prescribed NaN propagation semantics for SSE/AVX/AVX512. (Rasmus Munk Larsen, 2020-10-14)
* Remove leftover debug print statement in cxx11_tensor_expr.cpp (Rasmus Munk Larsen, 2020-10-14)
* Revert generic implementation of `predux`, since it breaks compilation of `predux_any` with MSVC. (Rasmus Munk Larsen, 2020-10-14)
* Add MatrixBase::cwiseArg() (David Tellenbach, 2020-10-14)
* Get rid of nested template specialization in TensorReductionGpu.h, which was broken by c6953f799b01d36f4236b64f351cc1446e0abe17. (Rasmus Munk Larsen, 2020-10-13)
* Add packet generic ops `predux_fmin`, `predux_fmin_nan`, `predux_fmax`, and `predux_fmax_nan` that implement reductions with `PropagateNaN` and `PropagateNumbers` semantics. Add (slow) generic implementations for most reductions. (Rasmus Munk Larsen, 2020-10-13)
* undefine EIGEN_CONSTEXPR before redefinition (acxz, 2020-10-12)
* Make bitwise_helper a device function to unbreak GPU builds. (Rasmus Munk Larsen, 2020-10-10)
* Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512. (Rasmus Munk Larsen, 2020-10-09)
* Disable test exceptions when using OpenMP. (David Tellenbach, 2020-10-09)
* Mention problems when using potentially throwing scalars and OpenMP (David Tellenbach, 2020-10-09)
* Fix typo in Tutorial_BlockOperations_block_assignment.cpp (Karl Ljungkvist, 2020-10-09)
* Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD (David Tellenbach, 2020-10-09)
* Implement generic bitwise logical packet ops that work for all types. (Rasmus Munk Larsen, 2020-10-08)
* Add EIGEN prefix for HAS_LGAMMA_R (David Tellenbach, 2020-10-08)
* Use lgamma_r if it is available (update check for glibc 2.19+) (Eugene Zhulenev, 2020-10-08)
* Don't make assumptions about NaN-propagation for pmin/pmax - it varies across platforms. Change test to only test for NaN-propagation for pfmin/pfmax. (Rasmus Munk Larsen, 2020-10-07)
* Use reinterpret_cast instead of C-style cast in Inverse_NEON.h (David Tellenbach, 2020-10-04)
* Don't cast away const in Inverse_NEON.h. (Rasmus Munk Larsen, 2020-10-02)
* Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h. (Rasmus Munk Larsen, 2020-10-02)
* Fix CUDA build breakage and incorrect result for absdiff on HIP with long double arguments. (Rasmus Munk Larsen, 2020-10-02)
* dont use =* might not return a Scalar (janos, 2020-10-02)
* Fix build breakage with MSVC 2019, which does not support MMX intrinsics for 64 bit builds, see: https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c (Rasmus Munk Larsen, 2020-10-01)
  Instead use the equivalent SSE2 intrinsics.
* Add generic packet ops corresponding to {std}::fmin and {std}::fmax. The nonsensical NaN-propagation rules for std::min and std::max implemented by pmin and pmax in Eigen are a longstanding source of confusion and bug reports. This change is a first step towards addressing it, as discussed in issue #564. (Rasmus Munk Larsen, 2020-10-01)
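The difference between the two families of NaN rules can be seen on scalars. The sketch below is plain standard C++, not Eigen code, and the wrapper names `min_std` and `min_fmin` are hypothetical. It assumes the usual `std::min` implementation, `(b < a) ? b : a`, and a build without fast-math style flags.

```cpp
#include <algorithm>
#include <cmath>

// std::min compares with operator<. Any comparison involving NaN is false,
// so the first argument is returned whenever either operand is NaN.
inline double min_std(double a, double b) { return std::min(a, b); }

// std::fmin treats NaN as missing data: if exactly one operand is NaN,
// the other operand is returned.
inline double min_fmin(double a, double b) { return std::fmin(a, b); }
```

With these wrappers, `min_std(NaN, 1.0)` is NaN while `min_std(1.0, NaN)` is `1.0`, so the result depends on argument order; `min_fmin` returns `1.0` in both cases.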
* Specialize pldexp_double and pfrexp_double and get rid of Packet2l definition for SSE. SSE does not support conversion between 64 bit integers and double, and the existing implementation of casting between Packet2d and Packet2l results in undefined behavior when casting NaN to int. Since pldexp and pfrexp only manipulate exponent fields that fit in 32 bits, this change provides specializations that use the existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead. (Rasmus Munk Larsen, 2020-09-30)
* Fix alignedbox 32-bit precision test failure. (Antonio Sanchez, 2020-09-30)
  The current `test/geo_alignedbox` tests fail on 32-bit arm due to small floating-point errors. In particular, the following is not guaranteed to hold:
  ```
  IsometryTransform identity = IsometryTransform::Identity();
  BoxType transformedC;
  transformedC.extend(c.transformed(identity));
  VERIFY(transformedC.contains(c));
  ```
  since `c.transformed(identity)` is ever-so-slightly different from `c`. Instead, we replace this test with one that checks an identity transform is within floating-point precision of `c`.
  Also updated the condition on `AlignedBox::transform(...)` to only accept `Affine`, `AffineCompact`, and `Isometry` modes explicitly. Otherwise, invalid combinations of modes would also incorrectly pass the assertion.
* Fix failure in GEBP kernel when compiling with OpenMP and FMA (David Tellenbach, 2020-09-30)
  Fixes #1995
* Revert !182. (Rasmus Munk Larsen, 2020-09-29)
* Add missing newline at the end of Inverse_NEON.h (Rasmus Munk Larsen, 2020-09-29)
* Fix compilation of 64 bit constant arguments to pset1frombits in TypeCasting.h on platforms where uint64_t != unsigned long. (Rasmus Munk Larsen, 2020-09-28)
* Fix compilation of pset1frombits calls on iOS. (Rasmus Munk Larsen, 2020-09-28)
* Provide a more efficient Packet2l->Packet2d cast method (Christoph Hertzberg, 2020-09-28)
* Added AlignedBox::transform(AffineTransform). (Martin Pecka, 2020-09-28)
* Make relative path variables of type STRING (Alexander Grund, 2020-09-28)
  When the type is PATH an absolute path is expected and user-defined values are converted into absolute paths relative to the current directory.
  Fixes #1990
* Fix Eigen::ThreadPool::CurrentThreadId returning wrong thread id when EIGEN_AVOID_THREAD_LOCAL and NDEBUG are defined (Zhuyie, 2020-09-25)
* Fix for ROCm/HIP breakage - 200921 (Deven Desai, 2020-09-22)
  The following commit causes regressions in the ROCm/HIP support for Eigen:
  https://gitlab.com/libeigen/eigen/-/commit/e55182ac09885d7558adf75e9e230b051a721c18
  I suspect the same breakages occur on the CUDA side too.
  The above commit puts the EIGEN_CONSTEXPR attribute on the `half_base` constructor. `half_base` is derived from `__half_raw`. When compiling with GPU support, the definition of `__half_raw` gets picked up from the GPU compiler specific header files (`hip_fp16.h`, `cuda_fp16.h`). Properly supporting the above commit would require adding the `constexpr` attribute to the `__half_raw` constructor (and other `*half*` routines) in those header files. While that is something we can explore in the future, for now we need to undo the above commit when compiling with GPU support, which is what this commit does.
  This commit also reverts a small change in the `raw_uint16_to_half` routine made by the above commit. Similar to the case above, that change was leading to compile errors due to the fact that `__half_raw` has a different definition when compiling with GPU support.
* Add CI configuration for ppc64le (David Tellenbach, 2020-09-22)
* Fix the #issue1997 and #issue1991 bugs triggered by unsupported a[index] (type a: __i28d) ops with the MSVC compiler (Guoqiang QI, 2020-09-21)
* Remove EIGEN_CONSTEXPR from NumTraits<boost::multiprecision::number<...>> (David Tellenbach, 2020-09-21)
* Fix using FindStandardMathLibrary.cmake with -Wall (-Wunused-value) added to CMAKE_CXX_FLAG (Павел Мацула, 2020-09-19)
* Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86. If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic. (Rasmus Munk Larsen, 2020-09-18)
* Fix undefined reference to pset1frombits bug on different platforms (guoqiangqi, 2020-09-19)
* Rename variable to avoid shadowing of a previously declared one (David Tellenbach, 2020-09-18)
* Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr. Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1. (Rasmus Munk Larsen, 2020-09-18)
* Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h. (Rasmus Munk Larsen, 2020-09-18)
  'vmvnq_u64' does not exist for some reason.