Commit message | Author | Age | |
---|---|---|---|
* | bug #1578: Improve prefetching in matrix multiplication on MIPS. | 2018-07-24 | |
| | |||
* | Re-enable FMA for fast sqrt functions | 2018-07-30 | |
| | |||
* | Fix AVX512 implementations of psqrt | 2018-06-25 | |
| | This commit fixes the AVX512 implementations of psqrt in the same way that 3ed67cb0bb4af65fbf243df598604a8c7630bf7d fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values instead of NaN. Fixing the issue requires additional instructions that slow down the algorithms. A test similar to the one used in 3ed67cb0bb4af65fbf243df598604a8c7630bf7d shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original. | ||
* | Add pcast packet op for NEON. | 2018-07-26 | |
| | |||
* | Fixed issue that prevented the documentation from being built | 2018-07-24 | |
| | |||
* | fix typo | 2018-07-23 | |
| | |||
* | Add lastN shortcuts to seq/seqN. | 2018-07-23 | |
| | |||
* | Disable type traits for libstdc++ <= 4.9.3 | 2018-07-20 | |
| | |||
* | Fix IsRelocatable without C++11 | 2018-07-19 | |
| | |||
* | Fix determination of EIGEN_HAS_TYPE_TRAITS | 2018-07-19 | |
| | |||
* | Fix stupid error in Quaternion move ctor | 2018-07-19 | |
| | |||
* | Add MIPS changes missing from previous merge. | 2018-07-18 | |
| | |||
* | Disable type traits for GCC < 5.1.0 | 2018-07-18 | |
| | |||
* | bug #1432: fix conservativeResize for non-relocatable scalar types. | 2018-07-18 | |
| | For those we need to bypass the realloc routines and fall back to allocate-as-new, copy, delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so for now we are very conservative and rely on either RequireInitialization or std::is_trivially_copyable. | ||
* | bug #1575: fix regression introduced in bug #1573 patch. | 2018-07-18 | |
| | Move ctor/assignment should not be defaulted. | ||
* | More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA | 2018-07-18 | |
| | |||
* | applying EIGEN_DECLARE_TEST to *gpu* tests | 2018-07-17 | |
| | Also, a few minor fixes for GPU tests running in HIP mode: (1) Adding an include for hip/hip_runtime.h in the Macros.h file. For HIP, __host__ and __device__ are macros defined in the HIP headers, so their definitions need to be included before they are used in the file. (2) Fixing the compile failure in TensorContractionGpu introduced by the commit "Fuse computations into the Tensor contractions using output kernel". (3) Fixing a HIP/clang-specific compile error by making the struct-member assignment explicit. | ||
* | bug #1573: add noexcept move constructor and move assignment operator to Quaternion | 2018-07-17 | |
| | |||
* | Some warning fixes | 2018-07-17 | |
| | |||
* | bug #1572: use c++11 atomic instead of volatile if c++11 is available, and disable multi-threaded GEMM on non-x86 without c++11. | 2018-07-17 | |
| | |||
* | Fix GeneralizedEigenSolver when requesting eigenvalues only. | 2018-07-14 | |
| | |||
* | Relax the condition to not only work on Android. | 2018-07-13 | |
| | |||
* | Clang produces incorrect Thumb2 assembler when using alloca. | 2018-07-13 | |
| | Don't define EIGEN_ALLOCA when generating Thumb with clang. | ||
* | bug #1571: fix is_convertible<from,to> with "from" a reference. | 2018-07-13 | |
| | |||
* | Forward declaring std::array does not work with all std libs, so let's just include <array> | 2018-07-13 | |
| | |||
* | Add support for MIPS SIMD (MSA) | 2018-07-06 | |
| | |||
* | Fix shadowing typedefs | 2018-07-12 | |
| | |||
* | Fix compilation regarding std::array | 2018-07-12 | |
| | |||
* | fix unused warning | 2018-07-12 | |
| | |||
* | Clean up the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate places (Macros.h); alignment/vectorization logic is now in util/ConfigureVectorization.h | 2018-07-12 | |
| | |||
* | Add missing consts for rows and cols functions in SparseLU | 2018-02-10 | |
| | |||
* | remove double ;; | 2018-07-12 | |
| | |||
* | bug #1570: fix warning | 2018-07-12 | |
| | |||
* | Merged in deven-amd/eigen (pull request PR-402) | 2018-07-12 | |
|\ | Adding support for using Eigen in HIP kernels. | ||
* | | Remove useless specialization thanks to is_convertible being more robust. | 2018-07-12 | |
| | | |||
* | | spellcheck | 2018-07-12 | |
| | | |||
* | | Make is_convertible more robust and conformant to std::is_convertible | 2018-07-12 | |
| | | |||
* | | Optimize the product of a HouseholderSequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using a faster blocked product. | 2018-07-11 | |
| | | |||
* | | Fix regression in 9357838f94d2907996adadc7e5200376f3561ed4 | 2018-07-11 | |
| | | |||
* | | Fix double ;; | 2018-07-11 | |
| | | |||
| * | Updates corresponding to the latest round of PR feedback | 2018-07-11 | |
| | | The major changes are: (1) Moving CUDA/PacketMath.h to GPU/PacketMath.h. (2) Moving CUDA/MathFunctions.h to GPU/MathFunctions.h. (3) Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h. The above three changes effectively enable the Eigen "Packet" layer for the HIP platform. (4) Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic"). (5) Updating the "EIGEN_DEVICE_FUNC" marking in some places. The change has been tested on the HIP and CUDA platforms. | ||
| * | renaming CUDA* to GPU* for some header files | 2018-07-11 | |
| | | |||
| * | merging updates from upstream | 2018-07-11 | |
| |\ | |/ |/| | |||
* | | Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix. | 2018-07-11 | |
| | | |||
* | | Add internal::is_identity compile-time helper | 2018-07-11 | |
| | | |||
* | | Fix conversion warning | 2018-07-10 | |
| | | |||
* | | bug #1543: improve linear indexing for general block expressions | 2018-07-10 | |
| | | |||
* | | Introduce the macro ei_declare_local_nested_eval to help allocate local temporaries on the stack via alloca, and let outer products make good use of it. | 2018-07-09 | |
| | | If successful, we should use it everywhere nested_eval is used to declare local dense temporaries. | ||
* | | Skip null numerators in triangular-vector-solve (as in BLAS TRSV). | 2018-07-09 | |
| | | |||
* | | Fix legitimate "declaration shadows a typedef" warning | 2018-07-09 | |
| | |