eigen - C++ library for linear algebra

	Commit message	Author	Age
...
\| * \|	Adding new arch/SYCL headers, used for SYCL vectorization.	Mehdi Goli	2018-08-01
\| \| \|
\| \| *	variadic version of assert which can take a parameter pack as its input.	Mehdi Goli	2018-08-01
\| \|/
\| *	bug #1578: Improve prefetching in matrix multiplication on MIPS.	Alexey Frunze	2018-07-24
\| \|
\| *	Re-enable FMA for fast sqrt functions	Mark D Ryan	2018-07-30
\| \|
\| *	Fix AVX512 implementations of psqrt	Mark D Ryan	2018-06-25
\| \|	This commit fixes the AVX512 implementations of psqrt in the same way that 3ed67cb0bb4af65fbf243df598604a8c7630bf7d fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in 3ed67cb0bb4af65fbf243df598604a8c7630bf7d shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.
\| *	Add pcast packet op for NEON.	Rasmus Munk Larsen	2018-07-26
\| \|
\| *	Fixed issue which made documentation not getting built anymore	Christoph Hertzberg	2018-07-24
\| \|
\| *	fix typo	Gael Guennebaud	2018-07-23
\| \|
\| *	Add lastN shortcuts to seq/seqN.	Gael Guennebaud	2018-07-23
\| \|
\| *	Disable type traits for stdlibc++ <= 4.9.3	Eugene Zhulenev	2018-07-20
\| \|
\| *	Fix IsRelocatable without C++11	Gael Guennebaud	2018-07-19
\| \|
\| *	Fix determination of EIGEN_HAS_TYPE_TRAITS	Gael Guennebaud	2018-07-19
\| \|
\| *	Add MIPS changes missing from previous merge.	Alexey Frunze	2018-07-18
\| \|
\| *	Disable type traits for GCC < 5.1.0	Eugene Zhulenev	2018-07-18
\| \|
\| *	bug #1432: fix conservativeResize for non-relocatable scalar types. For ↵	Gael Guennebaud	2018-07-18
\| \|	those we need to bypass realloc routines and fall back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable.
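The fallback described in this commit (skip realloc and instead allocate new, copy, delete for types that cannot be moved bytewise) can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not Eigen's implementation: `resize_buffer` and `SelfRef` are hypothetical names, and `std::is_trivially_copyable` stands in for the conservative relocatability test mentioned in the message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical example type: it stores a pointer to itself, so moving its
// bytes with realloc would leave `self` pointing at the old location.
struct SelfRef {
  int value;
  SelfRef* self;
  SelfRef(int v = 0) : value(v), self(this) {}
  SelfRef(const SelfRef& other) : value(other.value), self(this) {}
  ~SelfRef() {}
};
static_assert(!std::is_trivially_copyable<SelfRef>::value,
              "SelfRef must not be treated as relocatable");

// Hypothetical helper (not Eigen API): resize a malloc'ed buffer holding
// `old_n` constructed elements to `new_n` elements, keeping the first
// min(old_n, new_n). No exception-safety guarantees in this sketch.
template <typename T>
T* resize_buffer(T* old_buf, std::size_t old_n, std::size_t new_n) {
  assert(new_n > 0);
  if (std::is_trivially_copyable<T>::value) {
    // Bytewise relocation is safe: let realloc move the block if needed.
    // Elements past old_n are left uninitialized.
    void* p = std::realloc(static_cast<void*>(old_buf), new_n * sizeof(T));
    if (p == nullptr) throw std::bad_alloc();
    return static_cast<T*>(p);
  }
  // Conservative path: allocate new, copy-construct, destroy old, free.
  T* new_buf = static_cast<T*>(std::malloc(new_n * sizeof(T)));
  if (new_buf == nullptr) throw std::bad_alloc();
  const std::size_t kept = std::min(old_n, new_n);
  for (std::size_t i = 0; i < kept; ++i) new (new_buf + i) T(old_buf[i]);
  for (std::size_t i = kept; i < new_n; ++i) new (new_buf + i) T();
  for (std::size_t i = 0; i < old_n; ++i) old_buf[i].~T();
  std::free(old_buf);
  return new_buf;
}
```

With a type such as `SelfRef`, the copy constructor runs for every kept element, so each `self` pointer refers to the element's new address after the resize. For `int`, the same call reduces to a plain realloc.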
\| *	applying EIGEN_DECLARE_TEST to gpu tests	Deven Desai	2018-07-17
\| \|	Also, a few minor fixes for GPU tests running in HIP mode.
\| \|	1. Adding an include for hip/hip_runtime.h in the Macros.h file. For HIP, __host__ and __device__ are macros which are defined in hip headers. Their definitions need to be included before their use in the file.
\| \|	2. Fixing the compile failure in TensorContractionGpu introduced by the commit to "Fuse computations into the Tensor contractions using output kernel".
\| \|	3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit.
\| *	bug #1572: use c++11 atomic instead of volatile if c++11 is available, and ↵	Gael Guennebaud	2018-07-17
\| \|	disable multi-threaded GEMM on non-x86 without c++11.
\| *	Relax the condition to not only work on Android.	Rasmus Munk Larsen	2018-07-13
\| \|
\| *	Clang produces incorrect Thumb2 assembler when using alloca.	Rasmus Munk Larsen	2018-07-13
\| \|	Don't define EIGEN_ALLOCA when generating Thumb with clang.
\| *	bug #1571: fix is_convertible<from,to> with "from" a reference.	Gael Guennebaud	2018-07-13
\| \|
\| *	Forward declaring std::array does not work with all std libs, so let's just ↵	Gael Guennebaud	2018-07-13
\| \|	include <array>
\| *	Add support for MIPS SIMD (MSA)	Alexey Frunze	2018-07-06
\| \|
\| *	Fix compilation regarding std::array	Gael Guennebaud	2018-07-12
\| \|
\| *	fix unused warning	Gael Guennebaud	2018-07-12
\| \|
\| *	Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff at more appropriate ↵	Gael Guennebaud	2018-07-12
\| \|	places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h
\| *	remove double ;;	Gael Guennebaud	2018-07-12
\| \|
\| *	bug #1570: fix warning	Gael Guennebaud	2018-07-12
\| \|
\| *	Merged in deven-amd/eigen (pull request PR-402)	Gael Guennebaud	2018-07-12
\| \|\	Adding support for using Eigen in HIP kernels.
\| * \|	Remove useless specialization thanks to is_convertible being more robust.	Gael Guennebaud	2018-07-12
\| \| \|
\| * \|	spellcheck	Gael Guennebaud	2018-07-12
\| \| \|
\| * \|	Make is_convertible more robust and conformant to std::is_convertible	Gael Guennebaud	2018-07-12
\| \| \|
\| * \|	Fix regression in 9357838f94d2907996adadc7e5200376f3561ed4	Gael Guennebaud	2018-07-11
\| \| \|
\| * \|	Fix double ;;	Gael Guennebaud	2018-07-11
\| \| \|
\| \| *	Updates corresponding to the latest round of PR feedback	Deven Desai	2018-07-11
\| \| \|	The major changes are:
\| \| \|	1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
\| \| \|	2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
\| \| \|	3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
\| \| \|	The above three changes effectively enable the Eigen "Packet" layer for the HIP platform.
\| \| \|	4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
\| \| \|	5. Updating the "EIGEN_DEVICE_FUNC" marking in some places
\| \| \|	The change has been tested on the HIP and CUDA platforms.
\| \| *	renaming CUDA* to GPU* for some header files	Deven Desai	2018-07-11
\| \| \|
\| \| *	merging updates from upstream	Deven Desai	2018-07-11
\| \| \|\ \| \| \|/ \| \|/\|
\| * \|	Add internal::is_identity compile-time helper	Gael Guennebaud	2018-07-11
\| \| \|
\| * \|	Fix conversion warning	Gael Guennebaud	2018-07-10
\| \| \|
\| * \|	bug #1543: improve linear indexing for general block expressions	Gael Guennebaud	2018-07-10
\| \| \|
\| * \|	Introduce the macro ei_declare_local_nested_eval to help allocating on the ↵	Gael Guennebaud	2018-07-09
\| \| \|	stack local temporaries via alloca, and let outer products make good use of it. If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.
\| * \|	Skip null numerators in triangular-vector-solve (as in BLAS TRSV).	Gael Guennebaud	2018-07-09
\| \| \|
\| * \|	Fix legitimate "declaration shadows a typedef" warning	Gael Guennebaud	2018-07-09
\| \| \|
\| * \|	Fix the Packet16h version of ptranspose	Mark D Ryan	2018-06-16
\| \| \|	The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was reordering the PacketBlock argument incorrectly. This led to errors in the multiplication of matrices composed of 16 bit floats on AVX512 machines, if at least one of the matrices was using RowMajor order. This error is responsible for one tensorflow unit test failure on AVX512 machines: //tensorflow/python/kernel_tests:batch_matmul_op_test
\| * \|	Fix a few issues with Packet16h	Gael Guennebaud	2018-07-07
\| \| \|
\| * \|	complete implementation of Packet16h (AVX512)	Gael Guennebaud	2018-07-06
\| \| \|
\| * \|	Complete Packet8h implementation and test it in packetmath unit test	Gael Guennebaud	2018-07-06
\| \| \|
\| \| *	updates based on PR feedback	Deven Desai	2018-06-14
\| \| \|	There are two major changes (and a few minor ones which are not listed here; see the PR discussion for details).
\| \| \|	1. Eigen::half implementations for HIP and CUDA have been merged. This means that
\| \| \|	- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
\| \| \|	- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
\| \| \|	- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`
\| \| \|	After this change the `HIP/hcc` directory only contains one file, `math_constants.h`. That will go away too once that file becomes a part of the HIP install.
\| \| \|	2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
\| \| \|	- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC \|\| EIGEN_HIPCC)`
\| \| \|	- `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH \|\| EIGEN_HIP_DEVICE_COMPILE)`
\| \| \|	- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 \|\| EIGEN_HAS_HIP_FP16)`
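The idea of folding two vendor-specific macros into one umbrella macro, as this commit describes, might look roughly like the sketch below. All `MYLIB_*` names and the function `twice` are hypothetical and are not Eigen's definitions; only `__CUDACC__`, `__HIPCC__`, `__CUDA_ARCH__` and `__HIP_DEVICE_COMPILE__` are macros predefined by the respective compilers.

```cpp
#include <cassert>

// Hypothetical per-vendor compiler detection.
#if defined(__CUDACC__)
#define MYLIB_CUDACC 1
#endif
#if defined(__HIPCC__)
#define MYLIB_HIPCC 1
#endif

// Umbrella macro: set when either GPU compiler is in use.
#if defined(MYLIB_CUDACC) || defined(MYLIB_HIPCC)
#define MYLIB_GPUCC 1
#endif

// Umbrella macro: set only while compiling device code.
#if defined(__CUDA_ARCH__) || defined(__HIP_DEVICE_COMPILE__)
#define MYLIB_GPU_COMPILE_PHASE 1
#endif

// Code below tests the umbrella macro once instead of repeating
// "CUDA or HIP" at every use site.
#if defined(MYLIB_GPUCC)
#define MYLIB_DEVICE_FUNC __host__ __device__
#else
#define MYLIB_DEVICE_FUNC
#endif

// Callable from host code with any compiler, and from device code
// when built with a CUDA or HIP compiler.
MYLIB_DEVICE_FUNC inline int twice(int x) { return 2 * x; }
```

With a plain host compiler none of the GPU macros are defined, so `MYLIB_DEVICE_FUNC` expands to nothing and `twice` is an ordinary inline function.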
\| \| *	moving Half headers from CUDA dir to GPU dir, removing the HIP versions	Deven Desai	2018-06-13
\| \| \|
\| \| *	syncing this fork with upstream	Deven Desai	2018-06-13
\| \| \|\
\| * \| \|	Extend CUDA support to matrix inversion and selfadjointeigensolver	Andrea Bocci	2018-06-11
\| \| \| \|