Commit message | Author | Age | |
---|---|---|---|
* | Pulled latest updates from upstream | Benoit Steiner | 2016-04-29 |
|\ | |||
* | | Implemented palign_impl for AVX512 | Benoit Steiner | 2016-04-29 |
| | | |||
* | | Fixed the AVX512 packet traits | Benoit Steiner | 2016-04-29 |
| | | |||
* | | Added pdiv packet primitives for avx512 | Benoit Steiner | 2016-04-29 |
| | | |||
* | | Implemented preduxp for AVX512 | Benoit Steiner | 2016-04-29 |
| | | |||
* | | Implemented the pabs and preverse primitives for avx512. | Benoit Steiner | 2016-04-29 |
| | | |||
* | | Disabled some of the AVX512 primitives on compilers that don't support them | Benoit Steiner | 2016-04-29 |
| | | |||
| * | Fix compilation of sparse.cast<>().transpose(). | Gael Guennebaud | 2016-04-29 |
| | | |||
| * | Fixed the igamma and igammac implementations to make them callable from a GPU kernel. | Benoit Steiner | 2016-04-28 |
| * | Deleted unused variable | Benoit Steiner | 2016-04-28 |
| | | |||
| * | Eliminate mutual recursion in igamma{,c}_impl::Run. | Justin Lebar | 2016-04-28 |
| | | Presently, igammac_impl::Run calls igamma_impl::Run, which in turn calls igammac_impl::Run. This isn't actually mutual recursion; the calls are guarded such that we never get into a loop. Nonetheless, it's a stretch for clang to prove this. As a result, clang emits a recursive call in both igammac_impl::Run and igamma_impl::Run. That this is suboptimal code is bad enough, but it's particularly bad when compiling for CUDA/nvptx. nvptx allows recursion, but only begrudgingly: if you have recursive calls in a kernel, it's on you to manually specify the kernel's stack size. Otherwise, ptxas will dump a warning, make a guess, and who knows if it's right. This change explicitly eliminates the mutual recursion in igammac_impl::Run and igamma_impl::Run. | | |
| * | Merged in rmlarsen/eigen2 (pull request PR-183) | Benoit Steiner | 2016-04-27 |
| |\ | Detect cxx_constexpr support when compiling with clang. | | |
| | * | Depend on the more extensive support for constexpr in clang: | Rasmus Munk Larsen | 2016-04-27 |
| | | | http://clang.llvm.org/docs/LanguageExtensions.html#c-1y-relaxed-constexpr | | |
| | * | Detect cxx_constexpr support when compiling with clang. | Rasmus Munk Larsen | 2016-04-27 |
| | | | |||
| * | | fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen. | Benoit Steiner | 2016-04-27 |
| * | | Made the index type a template parameter to evaluateProductBlockingSizes | Benoit Steiner | 2016-04-27 |
| |/ | Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes. | | |
| * | Improved support for min and max on 16-bit floats when running on recent CUDA GPUs | Benoit Steiner | 2016-04-27 |
| * | Added support for fpclassify in Eigen::Numext | Benoit Steiner | 2016-04-27 |
| | | |||
| * | Merged in rmlarsen/eigen (pull request PR-179) | Benoit Steiner | 2016-04-21 |
| |\ | Prevent crash in CompleteOrthogonalDecomposition if object was default constructed. | | |
| | * | Prevent crash in CompleteOrthogonalDecomposition if object was default constructed. | Rasmus Munk Larsen | 2016-04-21 |
| * | | Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instructions. | Benoit Steiner | 2016-04-20 |
| * | | Made sure all the required header files are included when trying to use fp16 | Benoit Steiner | 2016-04-19 |
| |/ | |||
| * | Enable lazy-coeff-based-product for vector*(1x1) products | Gael Guennebaud | 2016-04-16 |
| | | |||
| * | Deleted extraneous comma. | Benoit Steiner | 2016-04-15 |
| | | |||
| * | bug #1203: by-pass large stack-allocation in stableNorm if EIGEN_STACK_ALLOCATION_LIMIT is too small | Gael Guennebaud | 2016-04-15 |
| * | Improved the matrix multiplication blocking in the case where mr is not a power of 2 (e.g. on Haswell CPUs). | Benoit Steiner | 2016-04-15 |
| * | Fix trmv for mixing types. | Gael Guennebaud | 2016-04-15 |
| | | |||
| * | Added ability to access the cache sizes from the tensor devices | Benoit Steiner | 2016-04-14 |
| | | |||
| * | Added support for exclusive or | Benoit Steiner | 2016-04-14 |
| | | |||
| * | Improve numerical robustness of JacobiSVD: | Gael Guennebaud | 2016-04-14 |
| | | - avoid noise amplification in complex to real conversion | | |
| | | - compare off-diagonal entries to the current biggest diagonal entry: no need to bother about a 2x2 block containing ridiculously small entries compared to the rest of the matrix. | | |
| * | Force the inlining of the << operator on half floats | Benoit Steiner | 2016-04-14 |
| | | |||
| * | Inline the << operator on half floats | Benoit Steiner | 2016-04-14 |
| | | |||
| * | Added ability to printf fp16 | Benoit Steiner | 2016-04-14 |
| | | |||
| * | Cleaning pass on rcond estimator. | Gael Guennebaud | 2016-04-14 |
| | | |||
| * | Better use .data() than &coeffRef(0) | Gael Guennebaud | 2016-04-14 |
| | | |||
| * | Merged in rmlarsen/eigen (pull request PR-174) | Gael Guennebaud | 2016-04-14 |
| |\ | Add matrix condition number estimation module. | | |
| * | | Properly gate the definition of the error and gamma functions for fp16 | Benoit Steiner | 2016-04-13 |
| | | | |||
| * | | Improved support for trigonometric functions on GPU | Benoit Steiner | 2016-04-13 |
| | | | |||
| * | | Added basic implementation of the lgamma, digamma, igamma, igammac, polygamma, and zeta functions for fp16 | Benoit Steiner | 2016-04-13 |
| * | | merge | Gael Guennebaud | 2016-04-13 |
| |\ \ | |||
| * | | | Fix JacobiSVD for complex when the complex-to-real update already gives a diagonal 2x2 block. | Gael Guennebaud | 2016-04-13 |
| | * | | Cleaned up the implementation of digamma | Benoit Steiner | 2016-04-13 |
| | | | | |||
| | * | | Pulled latest updates from trunk | Benoit Steiner | 2016-04-13 |
| |/| | | |||
| | * | | Added support for sin, cos, tan, and tanh on fp16 | Benoit Steiner | 2016-04-13 |
| | | | | |||
| * | | | Fix underflow in JacobiSVD's complex to real preconditioner | Gael Guennebaud | 2016-04-13 |
| |/ / | |||
| * | | Added support for computing cos, sin, tan, and tanh on GPU. | Benoit Steiner | 2016-04-13 |
| | | | |||
| * | | Added constructors to convert unsigned integers into fp16 | Benoit Steiner | 2016-04-13 |
| | | | |||
| * | | Workaround a division by zero when outerstride==0 | Gael Guennebaud | 2016-04-13 |
| | | | |||
| * | | Make use of is_same_dense helper instead of extract_data to detect input/outputs are the same. | Gael Guennebaud | 2016-04-13 |
| * | | Fix incomplete previous patch on matrix comparison. | Gael Guennebaud | 2016-04-13 |
| | | |
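
Note on "Eliminate mutual recursion in igamma{,c}_impl::Run" (2016-04-28): the commit message describes two entry points that call each other behind guards, so the compiler sees a call cycle even though it can never loop at run time. A minimal sketch of one way to break such a cycle is shown here. It is not Eigen's code. The namespace `sketch`, the helper names `igamma_series` and `igammac_cf`, the `x < a + 1` switch point, and the fixed iteration caps are all assumptions chosen for illustration, using the textbook series and continued-fraction forms of the regularized incomplete gamma functions. The point is only the call structure: both public functions call the leaf helpers directly and never call each other.

```cpp
#include <cmath>

// Hypothetical stand-ins for illustration; not Eigen's igamma_impl/igammac_impl.
namespace sketch {

// Leaf helper: power series for the regularized lower function P(a, x).
// Assumes a > 0 and x > 0. Calls no other incomplete-gamma function.
inline double igamma_series(double a, double x) {
  double ap = a;
  double term = 1.0 / a;
  double sum = term;
  for (int n = 0; n < 500; ++n) {
    ap += 1.0;
    term *= x / ap;
    sum += term;
    if (std::fabs(term) < std::fabs(sum) * 1e-15) break;
  }
  return sum * std::exp(-x + a * std::log(x) - std::lgamma(a));
}

// Leaf helper: continued fraction (modified Lentz) for the upper function Q(a, x).
// Intended for x >= a + 1. Calls no other incomplete-gamma function.
inline double igammac_cf(double a, double x) {
  const double tiny = 1e-300;
  double b = x + 1.0 - a;
  double c = 1.0 / tiny;
  double d = 1.0 / b;
  double h = d;
  for (int i = 1; i < 500; ++i) {
    const double an = -i * (i - a);
    b += 2.0;
    d = an * d + b;
    if (std::fabs(d) < tiny) d = tiny;
    c = b + an / c;
    if (std::fabs(c) < tiny) c = tiny;
    d = 1.0 / d;
    const double delta = d * c;
    h *= delta;
    if (std::fabs(delta - 1.0) < 1e-15) break;
  }
  return h * std::exp(-x + a * std::log(x) - std::lgamma(a));
}

// Public entry points. Each one picks a leaf helper itself, so the call
// graph is a tree: igamma never calls igammac and igammac never calls igamma.
inline double igamma(double a, double x) {
  if (x <= 0.0) return 0.0;
  return (x < a + 1.0) ? igamma_series(a, x) : 1.0 - igammac_cf(a, x);
}

inline double igammac(double a, double x) {
  if (x <= 0.0) return 1.0;
  return (x < a + 1.0) ? 1.0 - igamma_series(a, x) : igammac_cf(a, x);
}

}  // namespace sketch
```

With this layout neither entry point can reach itself through the other, so a compiler has no recursive call to emit, which is the property the commit message says matters for CUDA/nvptx kernels.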