path: root/Eigen
Commit message (Author, Date)
* Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD (David Tellenbach, 2020-10-09)
|
* Implement generic bitwise logical packet ops that work for all types. (Rasmus Munk Larsen, 2020-10-08)
|
* Don't make assumptions about NaN-propagation for pmin/pmax - it varies across platforms. (Rasmus Munk Larsen, 2020-10-07)
|   Change test to only test for NaN-propagation for pfmin/pfmax.
* Use reinterpret_cast instead of C-style cast in Inverse_NEON.h (David Tellenbach, 2020-10-04)
|
* Don't cast away const in Inverse_NEON.h. (Rasmus Munk Larsen, 2020-10-02)
|
* Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h. (Rasmus Munk Larsen, 2020-10-02)
|
* Fix CUDA build breakage and incorrect result for absdiff on HIP with long double arguments. (Rasmus Munk Larsen, 2020-10-02)
* Don't use =*; it might not return a Scalar (janos, 2020-10-02)
|
* Fix build breakage with MSVC 2019, which does not support MMX intrinsics for 64 bit builds. (Rasmus Munk Larsen, 2020-10-01)
|   See: https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c
|   Instead use the equivalent SSE2 intrinsics.
* Add generic packet ops corresponding to {std}::fmin and {std}::fmax. (Rasmus Munk Larsen, 2020-10-01)
|   The nonsensical NaN-propagation rules for std::min and std::max implemented by pmin and pmax in Eigen are a longstanding source of confusion and bug reports. This change is a first step towards addressing it, as discussed in issue #564.
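The difference between the two families of rules mentioned above can be seen with the scalar standard-library functions alone. The following is a minimal sketch that does not use Eigen's packet ops; the helper names `min_like_std`, `min_like_fmin`, and `demo_nan_propagation` are made up for illustration. It assumes IEEE 754 doubles and a build without fast-math style flags.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// std::min(a, b) is specified as (b < a) ? b : a. Any comparison with NaN is
// false, so the first argument is returned whenever either operand is NaN.
inline double min_like_std(double a, double b) { return std::min(a, b); }

// std::fmin treats NaN as missing data: if exactly one operand is NaN,
// the other operand is returned.
inline double min_like_fmin(double a, double b) { return std::fmin(a, b); }

inline void demo_nan_propagation() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(std::isnan(min_like_std(nan, 1.0)));  // NaN is the first argument
  assert(min_like_std(1.0, nan) == 1.0);       // NaN is silently dropped
  assert(min_like_fmin(nan, 1.0) == 1.0);      // order does not matter
  assert(min_like_fmin(1.0, nan) == 1.0);
}
```

With `std::min` the result depends on argument order, which is the kind of surprise the commit message refers to; `std::fmin` gives the same answer either way.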
* Specialize pldexp_double and pfdexp_double and get rid of Packet2l definition for SSE. (Rasmus Munk Larsen, 2020-09-30)
|   SSE does not support conversion between 64 bit integers and double, and the existing implementation of casting between Packet2d and Packet2l results in undefined behavior when casting NaN to int. Since pldexp and pfdexp only manipulate exponent fields that fit in 32 bit, this change provides specializations that use existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead.
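To illustrate why a 32-bit exponent is sufficient here, the following scalar sketch computes an ldexp-style scaling by writing a biased exponent directly into the bit pattern of a double. It is not Eigen's implementation and uses no SSE intrinsics; `pow2_from_exponent` and `ldexp_sketch` are hypothetical names, and the sketch assumes IEEE 754 binary64 and an exponent in the normal range.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: builds 2^e by placing the biased exponent (e + 1023)
// into bits 52..62 of a binary64 value. Only valid for -1022 <= e <= 1023.
inline double pow2_from_exponent(std::int32_t e) {
  assert(e >= -1022 && e <= 1023);
  const std::uint64_t bits = static_cast<std::uint64_t>(e + 1023) << 52;
  double result;
  std::memcpy(&result, &bits, sizeof(result));  // bit copy, no numeric conversion
  return result;
}

// ldexp-style scaling: x * 2^e. The exponent is an ordinary 32-bit integer,
// so no conversion between 64-bit integers and doubles is needed, and a NaN
// input is never converted to an integer.
inline double ldexp_sketch(double x, std::int32_t e) {
  return x * pow2_from_exponent(e);
}
```

As long as the result neither overflows nor underflows, multiplying by an exact power of two gives the same value as `std::ldexp`.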
* Fix alignedbox 32-bit precision test failure. (Antonio Sanchez, 2020-09-30)
|   The current `test/geo_alignedbox` tests fail on 32-bit arm due to small floating-point errors. In particular, the following is not guaranteed to hold:
|   ```
|   IsometryTransform identity = IsometryTransform::Identity();
|   BoxType transformedC;
|   transformedC.extend(c.transformed(identity));
|   VERIFY(transformedC.contains(c));
|   ```
|   since `c.transformed(identity)` is ever-so-slightly different from `c`. Instead, we replace this test with one that checks an identity transform is within floating-point precision of `c`.
|   Also updated the condition on `AlignedBox::transform(...)` to only accept `Affine`, `AffineCompact`, and `Isometry` modes explicitly. Otherwise, invalid combinations of modes would also incorrectly pass the assertion.
* Fix failure in GEBP kernel when compiling with OpenMP and FMA (David Tellenbach, 2020-09-30)
|   Fixes #1995
* Revert !182. (Rasmus Munk Larsen, 2020-09-29)
|
* Add missing newline at the end of Inverse_NEON.h (Rasmus Munk Larsen, 2020-09-29)
|
* Fix compilation of 64 bit constant arguments to pset1frombits in TypeCasting.h on platforms where uint64_t != unsigned long. (Rasmus Munk Larsen, 2020-09-28)
* Fix compilation of pset1frombits calls on iOS. (Rasmus Munk Larsen, 2020-09-28)
|
* Provide a more efficient Packet2l->Packet2d cast method (Christoph Hertzberg, 2020-09-28)
|
* Added AlignedBox::transform(AffineTransform). (Martin Pecka, 2020-09-28)
|
* Fix for ROCm/HIP breakage - 200921 (Deven Desai, 2020-09-22)
|   The following commit causes regressions in the ROCm/HIP support for Eigen:
|   https://gitlab.com/libeigen/eigen/-/commit/e55182ac09885d7558adf75e9e230b051a721c18
|   I suspect the same breakages occur on the CUDA side too.
|   The above commit puts the EIGEN_CONSTEXPR attribute on the `half_base` constructor. `half_base` is derived from `__half_raw`. When compiling with GPU support, the definition of `__half_raw` gets picked up from the GPU compiler specific header files (`hip_fp16.h`, `cuda_fp16.h`). Properly supporting the above commit would require adding the `constexpr` attribute to the `__half_raw` constructor (and other `*half*` routines) in those header files. While that is something we can explore in the future, for now we need to undo the above commit when compiling with GPU support, which is what this commit does.
|   This commit also reverts a small change in the `raw_uint16_to_half` routine made by the above commit. Similar to the case above, that change was leading to compile errors due to the fact that `__half_raw` has a different definition when compiling with GPU support.
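The underlying language rule is that a derived-class constructor can only be used in a constant expression if the base-class constructor it calls is itself `constexpr`. The sketch below shows one way such a dependency might be made conditional. All names here (`SKETCH_GPU_COMPILE`, `SKETCH_CONSTEXPR`, `raw_half_sketch`, `half_base_sketch`, `bits_of`) are assumptions for illustration and are not Eigen's or any vendor header's actual identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical switch: when a (pretend) GPU toolchain supplies the base type
// without constexpr constructors, drop constexpr from the derived type too.
#if defined(SKETCH_GPU_COMPILE)
#define SKETCH_CONSTEXPR
#else
#define SKETCH_CONSTEXPR constexpr
#endif

// Stand-in for a raw half-precision storage type. In a GPU build this
// definition would come from a vendor header instead.
struct raw_half_sketch {
  SKETCH_CONSTEXPR raw_half_sketch() : x(0) {}
  SKETCH_CONSTEXPR explicit raw_half_sketch(std::uint16_t bits) : x(bits) {}
  std::uint16_t x;
};

// The derived type's constructors are constexpr only when the base's are.
struct half_base_sketch : raw_half_sketch {
  SKETCH_CONSTEXPR half_base_sketch() : raw_half_sketch() {}
  SKETCH_CONSTEXPR explicit half_base_sketch(std::uint16_t bits)
      : raw_half_sketch(bits) {}
};

inline std::uint16_t bits_of(const half_base_sketch& h) { return h.x; }

inline void demo_half_base() {
  // 0x3c00 is the binary16 bit pattern of 1.0.
  assert(bits_of(half_base_sketch(0x3c00)) == 0x3c00);
}
```

Defining `SKETCH_GPU_COMPILE` before this code removes `constexpr` from both types at once, which keeps the derived constructors consistent with whatever the base provides.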
* Fix the #issue1997 and #issue1991 bugs triggered by unsupported a[index] ops (type a: __i28d) with the MSVC compiler (Guoqiang QI, 2020-09-21)
* Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86. (Rasmus Munk Larsen, 2020-09-18)
|   If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.
* Fix undefined reference to pset1frombits bug on different platforms (guoqiangqi, 2020-09-19)
|
* Rename variable to avoid shadowing of a previously declared one (David Tellenbach, 2020-09-18)
|
* Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr. (Rasmus Munk Larsen, 2020-09-18)
|   Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.
* Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h. (Rasmus Munk Larsen, 2020-09-18)
|   'vmvnq_u64' does not exist for some reason.
* Fix typo in PacketMath.h (Rasmus Munk Larsen, 2020-09-18)
|
* Add missing packet op pcmp_lt_or_nan for Packet2d on ARM. (Rasmus Munk Larsen, 2020-09-18)
|
* Disable double version of compute_inverse_size4 on Inverse_NEON.h if Packet2d is not supported. (Rasmus Munk Larsen, 2020-09-17)
* Add support for CastXML on ARM aarch64 (Brad King, 2020-09-16)
|   CastXML simulates the preprocessors of other compilers, but actually parses the translation unit with an internal Clang compiler. Use the same `vld1q_u64` workaround that we do for Clang.
|   Fixes: #1979
* Fix compiler error due to C++20 operator== generation rules (daravi, 2020-09-16)
|
* Remove old Clang compiler bug work-arounds. (Benoit Jacob, 2020-09-15)
|   The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.
* Make bfloat16(float(-nan)) produce -nan, not nan. (Tim Shen, 2020-09-15)
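A sign-preserving float to bfloat16 conversion can be sketched on raw bit patterns as follows. This is an illustrative sketch under stated assumptions (IEEE 754 binary32 floats, round-to-nearest-even, NaNs mapped to a quiet NaN with the original sign), not the code from the commit; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper operating on the raw binary32 bit pattern.
inline std::uint16_t float_bits_to_bfloat16_bits(std::uint32_t bits) {
  // NaN: exponent all ones and a non-zero mantissa.
  const bool is_nan = (bits & 0x7FFFFFFFu) > 0x7F800000u;
  if (is_nan) {
    // Emit a quiet NaN but keep the sign bit of the input.
    return static_cast<std::uint16_t>(((bits >> 16) & 0x8000u) | 0x7FC0u);
  }
  // Round to nearest even on the 16 bits that are dropped.
  const std::uint32_t lsb = (bits >> 16) & 1u;
  return static_cast<std::uint16_t>((bits + 0x7FFFu + lsb) >> 16);
}

inline std::uint16_t float_to_bfloat16_bits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // well-defined bit copy
  return float_bits_to_bfloat16_bits(bits);
}

inline void demo_bfloat16_nan_sign() {
  assert(float_bits_to_bfloat16_bits(0xFFC00000u) == 0xFFC0u);  // -nan stays negative
  assert(float_bits_to_bfloat16_bits(0x7FC00000u) == 0x7FC0u);  // +nan stays positive
}
```

Handling NaN before rounding also avoids the case where adding the rounding bias to a NaN payload would carry into the exponent or sign bits.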
|
* Add plog ops support for Packet2d on NEON (Guoqiang QI, 2020-09-15)
|
* Add EIGEN_UNUSED_VARIABLE to unused variable in Memory.h (Rasmus Munk Larsen, 2020-09-15)
|
* Fix bfloat16 round on gcc 4.8 (Pedro Caldeira, 2020-09-14)
|
* Fix issue #1968. Don't discard return value from "new" in C++17. (Rasmus Munk Larsen, 2020-09-13)
|
* Unified SSE pldexp_double API (Guoqiang QI, 2020-09-12)
|
* Make blueNorm threadsafe if C++11 atomics are available. (Rasmus Munk Larsen, 2020-09-12)
|
* Fix half_impl::float_to_half_rtne(float) warning: '<<' causes overflow (Niels Dekker, 2020-09-10)
|   Fixed Visual Studio 2019 Code Analysis (C++ Core Guidelines) warning C26450 from inside `half_impl::float_to_half_rtne(float)`:
|   > Arithmetic overflow: '<<' operation causes overflow at compile time.
* Add missing functions for Packet8bf in Altivec architecture. (Pedro Caldeira, 2020-09-08)
|   Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.
* Add Neon psqrt<Packet2d> and pexp<Packet2d> (Guoqiang QI, 2020-09-08)
|
* Remove semi triggering -Wextra-semi-stmt (Alexander Neumann, 2020-09-07)
|
* Add Inverse_NEON.h (Stephen Zheng, 2020-09-04)
|   Implemented fast size-4 matrix inverse (mimicking Inverse_SSE.h) using NEON intrinsics.
|   ```
|   Benchmark      Time       CPU   Time Old   Time New   CPU Old   CPU New
|   -----------------------------------------------------------------------
|   BM_float    -0.1285   -0.1275        568        495       572       499
|   BM_double   -0.2265   -0.2254        638        494       641       496
|   ```
* MatrixProduct enhancements: (Everton Constantino, 2020-09-02)
|   - Changes to Altivec/MatrixProduct
|     Adapting code to gcc 10.
|     Generic code style and performance enhancements.
|     Adding PanelMode support.
|     Adding stride/offset support.
|     Enabling float64, std::complex and std::complex.
|     Fixing lack of symm_pack.
|     Enabling mixedtypes.
|   - Adding std::complex tests to blasutil.
|   - Adding an implementation of storePacketBlock when Incr != 1.
* Changing u/int8_t to un/signed char because clang does not understand it. (Everton Constantino, 2020-09-02)
|   Implementing pcmp_eq to Packet8 and Packet16.
* Fix #1901: warning in Mode==(Upper|Lower) (Gael Guennebaud, 2020-09-02)
|
* Change Packet8s and Packet8us to use vector commands on Power for pmadd, pmul and psub. (Chip Kerchner, 2020-08-28)
* Fix #1974: assertion when reserving an empty sparse matrix (Gael Guennebaud, 2020-08-26)
|
* Add psqrt ops support for Packet2f/Packet4f on NEON (Guoqiang QI, 2020-08-21)
|