Commit message | Author | Age | |
---|---|---|---|
* | Add support for MIPS SIMD (MSA) | 2018-07-06 | |
| | |||
* | Fix shadowing typedefs | 2018-07-12 | |
| | |||
* | Fix compilation regarding std::array | 2018-07-12 | |
| | |||
* | fix unused warning | 2018-07-12 | |
| | |||
* | Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h | 2018-07-12 | |
| | |||
* | Add missing consts for rows and cols functions in SparseLU | 2018-02-10 | |
| | |||
* | remove double ;; | 2018-07-12 | |
| | |||
* | bug #1570: fix warning | 2018-07-12 | |
| | |||
* | Merged in deven-amd/eigen (pull request PR-402) | 2018-07-12 | |
|\ | Adding support for using Eigen in HIP kernels. | ||
* | | Remove useless specialization thanks to is_convertible being more robust. | 2018-07-12 | |
| | | |||
* | | spellcheck | 2018-07-12 | |
| | | |||
* | | Make is_convertible more robust and conformant to std::is_convertible | 2018-07-12 | |
| | | |||
* | | Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using a faster blocked product. | 2018-07-11 | |
| | | |||
* | | Fix regression in 9357838f94d2907996adadc7e5200376f3561ed4 | 2018-07-11 | |
| | | |||
* | | Fix double ;; | 2018-07-11 | |
| | | |||
| * | Updates corresponding to the latest round of PR feedback | 2018-07-11 | |
| | | The major changes are: 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h; 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h; 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h. The above three changes effectively enable the Eigen "Packet" layer for the HIP platform. 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic"); 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places. The change has been tested on the HIP and CUDA platforms. | ||
| * | renaming CUDA* to GPU* for some header files | 2018-07-11 | |
| | | |||
| * | merging updates from upstream | 2018-07-11 | |
| |\ | |/ |/| | |||
* | | Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix. | 2018-07-11 | |
| | | |||
* | | Add internal::is_identity compile-time helper | 2018-07-11 | |
| | | |||
* | | Fix conversion warning | 2018-07-10 | |
| | | |||
* | | bug #1543: improve linear indexing for general block expressions | 2018-07-10 | |
| | | |||
* | | Introduce the macro ei_declare_local_nested_eval to help allocate local temporaries on the stack via alloca, and let outer products make good use of it. If successful, we should use it everywhere nested_eval is used to declare local dense temporaries. | 2018-07-09 | |
| | | |||
* | | Skip null numerators in triangular-vector-solve (as in BLAS TRSV). | 2018-07-09 | |
| | | |||
* | | Fix legitimate "declaration shadows a typedef" warning | 2018-07-09 | |
| | | |||
* | | Fix the Packet16h version of ptranspose | 2018-06-16 | |
| | | The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was reordering the PacketBlock argument incorrectly. This led to errors in the multiplication of matrices composed of 16-bit floats on AVX512 machines if at least one of the matrices was using RowMajor order. This error is responsible for one TensorFlow unit test failure on AVX512 machines: //tensorflow/python/kernel_tests:batch_matmul_op_test | ||
* | | Fix a few issues with Packet16h | 2018-07-07 | |
| | | |||
* | | complete implementation of Packet16h (AVX512) | 2018-07-06 | |
| | | |||
* | | Complete Packet8h implementation and test it in packetmath unit test | 2018-07-06 | |
| | | |||
| * | updates based on PR feedback | 2018-06-14 | |
| | | There are two major changes (and a few minor ones which are not listed here; see the PR discussion for details). 1. Eigen::half implementations for HIP and CUDA have been merged. This means that: `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`; `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`; `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`. After this change the `HIP/hcc` directory only contains one file, `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate: `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`; `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`; `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`. | ||
| * | moving Half headers from CUDA dir to GPU dir, removing the HIP versions | 2018-06-13 | |
| | | |||
| * | syncing this fork with upstream | 2018-06-13 | |
| |\ | |||
* | | | Extend CUDA support to matrix inversion and SelfAdjointEigenSolver | 2018-06-11 | |
| | | | |||
* | | | bug #1565: help MSVC to generate not too bad ASM in reductions. | 2018-07-05 | |
| | | | |||
* | | | Implement custom inplace triangular product to avoid a temporary | 2018-07-03 | |
| | | | |||
* | | | Make is_same_dense compatible with different scalar types. | 2018-07-03 | |
| | | | |||
* | | | Fix regression in changeset f05dea6b2326836e5e0243fbaffbece84b833d64: computeFromHessenberg can take any expression for matrixQ, not only a HouseholderSequence. | 2018-07-02 | |
| | | | |||
* | | | Simplify redux_evaluator using inheritance, and properly rename parameters in reducers. | 2018-07-02 | |
| | | | |||
* | | | bug #1562: optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x... | 2018-07-02 | |
| | | | |||
* | | | update comment | 2018-06-29 | |
| | | | |||
* | | | Fix order of EIGEN_DEVICE_FUNC and returned type | 2018-06-28 | |
| | | | |||
* | | | First step towards a generic vectorised quaternion product | 2018-06-25 | |
| | | | |||
* | | | bug #1560: fix product with a 1x1 diagonal matrix | 2018-06-25 | |
| | | | |||
* | | | Fix typo in pblend for AltiVec. | 2018-06-22 | |
| |/ |/| | |||
* | | Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403) | 2018-06-11 | |
|\ \ | Derivative of the incomplete Gamma function and the sample of a Gamma random variable. Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com> | ||
* | | | bug #1531: expose NumDimensions for solve and sparse expressions. | 2018-06-08 | |
| | | | |||
* | | | bug #1531: expose NumDimensions for compatibility with Tensor | 2018-06-08 | |
| | | | |||
* | | | bug #1550: prevent avoidable memory allocation in RealSchur | 2018-06-08 | |
| | | | |||
* | | | Don't use std::equal_to inside CUDA kernels since it's not supported. | 2018-06-07 | |
| | | | |||
* | | | Missing line during manual rebase of PR-374 | 2018-06-07 | |
| | | |