path: root/Eigen/src/Core
Commit message [Author, Date]
* Fix missing header inclusion and colliding definitions for half type casting, which broke build with -march=native on Haswell/Skylake. [Rasmus Munk Larsen, 2019-08-30]
* Add more tests for corner cases of log1p and expm1. Add handling of infinite arguments to log1p such that log1p(inf) = inf. [Rasmus Munk Larsen, 2019-08-28]
* Revert changes to std_fallback::log1p that broke handling of arguments less than -1. Fix packet op accordingly. [Rasmus Munk Larsen, 2019-08-27]
* Clean up float16 a.k.a. Eigen::half support in Eigen. Move the definition of half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories. [Rasmus Munk Larsen, 2019-08-27]
* Merged in jaopaulolc/eigen (pull request PR-679) [Christoph Hertzberg, 2019-08-22]
|\
| |   Fixes for Altivec/VSX and compilation with clang on PowerPC
| * Fix debug macros in p{load,store}u [João P. L. de Carvalho, 2019-08-14]
| |
| * Add missing pcmp_XX methods for double/Packet2d [João P. L. de Carvalho, 2019-08-14]
| |   This actually fixes an issue in unit-test packetmath_2 with pcmp_eq when it is compiled with clang. When pcmp_eq(Packet4f,Packet4f) is used instead of pcmp_eq(Packet2d,Packet2d), the unit-test does not pass due to NaN on ref vector.
* | Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments. [Rasmus Munk Larsen, 2019-08-12]
| |   Depending on instruction set, significant speedups are observed for the vectorized path:
| |     log1p wall time is reduced 60-93% (2.5x - 15x speedup)
| |     expm1 wall time is reduced 0-85% (1x - 7x speedup)
| |   The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly.
| |   Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM
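As a rough illustration of the Kahan-style formulas mentioned in the commit above, here is a minimal scalar sketch in C++. The function names `log1p_kahan` and `expm1_kahan` are hypothetical, and this is not Eigen's actual implementation; it only shows the rounding-error correction and the explicit branches that infinite arguments require.

```cpp
#include <cmath>

// Hypothetical helper, not Eigen's code: log(1 + x) with Kahan's correction.
double log1p_kahan(double x) {
  const double u = 1.0 + x;
  if (u == 1.0) return x;                  // x is lost when added to 1: log1p(x) ~= x
  if (std::isinf(x) && x > 0.0) return x;  // avoid inf / inf = NaN for x = +inf
  // log(u) / (u - 1) compensates for the rounding error made when forming u.
  return x * (std::log(u) / (u - 1.0));
}

// Hypothetical helper, not Eigen's code: exp(x) - 1 with Kahan's correction.
double expm1_kahan(double x) {
  const double u = std::exp(x);
  if (u == 1.0) return x;             // exp(x) rounds to 1: expm1(x) ~= x
  const double um1 = u - 1.0;
  if (um1 == -1.0) return -1.0;       // exp(x) underflowed to 0 (includes x = -inf)
  if (std::isinf(u)) return u;        // overflow or x = +inf: avoid inf / inf
  return um1 * (x / std::log(u));
}
```

The extra comparisons for the infinite cases are the kind of branch the commit message blames for the slower scalar path; a vectorized version would typically replace them with compare-and-select operations.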
| * Fix packed load/store for PowerPC's VSX [João P. L. de Carvalho, 2019-08-09]
| |   The vec_vsx_ld/vec_vsx_st builtins were wrongly used for aligned load/store. In fact, they perform unaligned memory access and, even when the address is 16-byte aligned, they are much slower (at least 2x) than their aligned counterparts. For double/Packet2d, vec_xl/vec_xst should be preferred over vec_ld/vec_st, although the latter works when cast to float/Packet4f. Silences some odd warnings thrown by some GCC versions; such warnings are not thrown by Clang.
| * Fix offset argument of ploadu/pstoreu for Altivec [João P. L. de Carvalho, 2019-08-09]
| |   If no offset is given, then it should be zero. Also passes the full address to the vec_vsx_ld/st builtins. Removes useless _EIGEN_ALIGNED_PTR & _EIGEN_MASK_ALIGNMENT. Removes unnecessary casts.
| * bug #1718: Add cast to successfully compile with clang on PowerPC [João P. L. de Carvalho, 2019-08-09]
| |   Ignoring -Wc11-extensions warnings thrown by clang at Altivec/PacketMath.h
|/
* Fix bugs in log1p and expm1 where repeated using statements would clobber each other. Add specializations for complex types since std::log1p and std::expm1 do not support complex. [Rasmus Munk Larsen, 2019-08-08]
* Remove {} accidentally added in previous commit [Christoph Hertzberg, 2019-07-18]
|
* Move variadic constructors outside `#ifndef EIGEN_PARSED_BY_DOXYGEN` block, to make them actually appear in the generated documentation. [Christoph Hertzberg, 2019-07-12]
* Build deprecated snippets with -DEIGEN_NO_DEPRECATED_WARNING [Christoph Hertzberg, 2019-07-12]
|   Also, document LinSpaced only where it is implemented
* Fix compiler for unsigned integers. [Rasmus Munk Larsen, 2019-07-09]
|
* PR 655: Fix missing Eigen namespace in Macros [Justin Carpentier, 2019-06-05]
|
* [SYCL] Adding the SYCL memory model. [Mehdi Goli, 2019-07-01]
|   The SYCL memory model provides:
|   * an interface for SYCL buffers to behave as a non-dereferenceable pointer
|   * an interface for placeholder accessor to behave like a pointer on both host and device
* Fix CUDA compilation error for pselect<half>. [Rasmus Munk Larsen, 2019-06-28]
|
* [SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL. [Mehdi Goli, 2019-06-27]
|   * Adding SYCL memory model
|   * Enabling/Disabling SYCL backend in Core
|   * Supporting Vectorization
* Fix for a ROCm/HIP-specific compile error introduced by a recent commit. [Deven Desai, 2019-06-22]
|
* Remove extra "one" in comment. [Rasmus Munk Larsen, 2019-06-20]
|
* Update comment as suggested by tra@google.com. [Rasmus Munk Larsen, 2019-06-20]
|
* Fix grammar. [Rasmus Munk Larsen, 2019-06-20]
|
* Added comment explaining the surprising EIGEN_COMP_CLANG && !EIGEN_COMP_NVCC clause. [Rasmus Munk Larsen, 2019-06-20]
* Fix CUDA build on Mac. [Rasmus Munk Larsen, 2019-06-20]
|
* Various fixes for packet ops. [Rasmus Munk Larsen, 2019-06-20]
|   1. Fix buggy pcmp_eq and unit test for half types.
|   2. Add unit test for pselect and add specializations for SSE 4.1, AVX512, and half types.
|   3. Get rid of FIXME: Implement faster pnegate for half by XOR'ing with a sign bit mask.
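Item 3 above relies on the fact that negating an IEEE 754 binary16 value only requires flipping its top bit. A minimal sketch of that idea on raw 16-bit patterns follows; the names `kHalfSignMask`, `negate_half_bits` and `negate_half_packet` are hypothetical and do not correspond to Eigen's packet API.

```cpp
#include <cstddef>
#include <cstdint>

// Sign bit of an IEEE 754 binary16 value stored as a raw 16-bit pattern.
constexpr std::uint16_t kHalfSignMask = 0x8000u;

// Hypothetical scalar helper: flip the sign bit, leave exponent and mantissa alone.
constexpr std::uint16_t negate_half_bits(std::uint16_t bits) {
  return static_cast<std::uint16_t>(bits ^ kHalfSignMask);
}

// Hypothetical "packet" helper: the same XOR applied lane by lane.
// A SIMD implementation would do this with a single vector XOR instruction.
inline void negate_half_packet(std::uint16_t* lanes, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    lanes[i] = negate_half_bits(lanes[i]);
  }
}
```

Because no conversion to float is involved, the operation also preserves NaN payloads and maps +0 to -0, which a subtraction from zero would not.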
* bug #1724: Mask buggy warnings with g++-7 [Christoph Hertzberg, 2019-06-14]
|   (grafted from 427f2f66d69ae9b124c2f8bcd927fb6e19e07e91)
* Make is_valid_index_type return false for float and double when EIGEN_HAS_TYPE_TRAITS is off. [Rasmus Munk Larsen, 2019-06-05]
* Add workaround for choosing the right include files with FP16C support with clang. [Rasmus Munk Larsen, 2019-06-05]
* Clean up CUDA/NVCC version macros and their use in Eigen, and fix a few other CUDA build failures. [Rasmus Munk Larsen, 2019-05-31]
* Fix for HIP build errors that were introduced by a commit earlier this week [Deven Desai, 2019-05-24]
|
* GEMV: remove double declaration of constant. [Gustavo Lima Chaves, 2019-05-23]
|   That was hurting users whose compilers reject the duplicate declaration:
|   """
|   ./Eigen/src/Core/products/GeneralMatrixVector.h:356:10: error: declaration shadows a static data member of 'general_matrix_vector_product<type-parameter-0-0, type-parameter-0-1, type-parameter-0-2, 1, ConjugateLhs, type-parameter-0-4, type-parameter-0-5, ConjugateRhs, Version>' [-Werror,-Wshadow]
|       LhsPacketSize = Traits::LhsPacketSize,
|       ^
|   ./Eigen/src/Core/products/GeneralMatrixVector.h:307:22: note: previous declaration is here
|     static const Index LhsPacketSize = Traits::LhsPacketSize;
|   """
* Enable support for F16C with Clang. The required intrinsics were added here: https://reviews.llvm.org/D16177 and are part of LLVM 3.8.0. [Rasmus Munk Larsen, 2019-05-20]
* Merged in rmlarsen/eigen (pull request PR-643) [Rasmus Larsen, 2019-05-20]
|\
| |   Make Eigen build with cuda 10 and clang.
| |   Approved-by: Justin Lebar <justin.lebar@gmail.com>
* \ Merged in scramsby/eigen (pull request PR-646) [Gael Guennebaud, 2019-05-20]
|\ \
| | |   Eigen: Fix MSVC C++17 language standard detection logic
* \ \ Merged in glchaves/eigen (pull request PR-635) [Rasmus Larsen, 2019-05-17]
|\ \ \
| | | |   Speed up GEMV on AVX-512 builds, just as done for GEBP previously.
| | | |   Approved-by: Rasmus Larsen <rmlarsen@google.com>
| | | * Make Eigen build with cuda 10 and clang. [Rasmus Munk Larsen, 2019-05-15]
| |_|/
|/| |
* | | Removing unused API to fix compile error in TensorFlow due to AVX512VL, AVX512BW usage [Anuj Rawat, 2019-05-12]
* | | bug #1707: Fix deprecation warnings, or disable warnings when testing deprecated functions [Christoph Hertzberg, 2019-05-10]
* | | Fix build with clang on Windows. [Rasmus Munk Larsen, 2019-05-09]
| | |
* | | Fix AVX512 & GCC 6.3 compilation [Eugene Zhulenev, 2019-05-07]
| | |
* | | Restore C++03 compatibility [Christoph Hertzberg, 2019-05-06]
| | |
* | | Fix traits for scalar_logistic_op. [Rasmus Munk Larsen, 2019-05-03]
| | |
| | * Eigen: Fix MSVC C++17 language standard detection logic [Scott Ramsby, 2019-05-03]
| | |   To detect C++17 support, use _MSVC_LANG macro instead of _MSC_VER. _MSC_VER can indicate whether the current compiler version could support the C++17 language standard, but not whether that standard is actually selected (i.e. via /std:c++17).
| | |   See these web pages for more details:
| | |   https://devblogs.microsoft.com/cppblog/msvc-now-correctly-reports-__cplusplus/
| | |   https://docs.microsoft.com/en-us/cpp/preprocessor/predefined-macros
| |/
|/|
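The detection logic described in the commit above can be sketched with the preprocessor as follows. The `DEMO_*` macro names and `demo_has_cxx17` are hypothetical and are not the macros Eigen defines; only `_MSVC_LANG` and `__cplusplus` are real predefined macros. The sketch assumes that MSVC may report an outdated `__cplusplus` value unless `/Zc:__cplusplus` is passed, which is why `_MSVC_LANG` is consulted first.

```cpp
// Pick the macro that reflects the language standard actually selected.
#if defined(_MSVC_LANG)
  // MSVC: _MSVC_LANG tracks /std:c++14, /std:c++17, ... regardless of /Zc:__cplusplus.
  #define DEMO_CPLUSPLUS _MSVC_LANG
#else
  // Other compilers: __cplusplus already reflects the selected standard.
  #define DEMO_CPLUSPLUS __cplusplus
#endif

// 201703L is the value both macros take for C++17.
#if DEMO_CPLUSPLUS >= 201703L
  #define DEMO_HAS_CXX17 1
#else
  #define DEMO_HAS_CXX17 0
#endif

// Small accessor so the result can be inspected from ordinary code.
constexpr bool demo_has_cxx17() { return DEMO_HAS_CXX17 != 0; }
```

Checking `_MSC_VER` instead would only say that the compiler is new enough to offer C++17, not that the build actually enabled it.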
* | Add masked_store_available to unpacket_traits [Eugene Zhulenev, 2019-05-02]
| |
* | Add masked pstoreu for Packet16h [Eugene Zhulenev, 2019-05-02]
| |
* | Add masked pstoreu to AVX and AVX512 PacketMath [Eugene Zhulenev, 2019-05-02]
| |
* | Fix regression in changeset ae33e866c750c6c24ada5c6f7f3ec15815d0e683 [Gael Guennebaud, 2019-05-02]
| |
| * Speed up GEMV on AVX-512 builds, just as done for GEBP previously. [Gustavo Lima Chaves, 2019-04-26]
| |   We take advantage of smaller SIMD registers as well, in that case. Gains up to 3x for select input sizes.