Commit message | Author | Age | ||
---|---|---|---|---|
... | ||||
* | | Fixed include path | Benoit Steiner | 2016-04-29 | |
| | | ||||
* | | Fix compilation of sparse.cast<>().transpose(). | Gael Guennebaud | 2016-04-29 | |
| | | ||||
* | | Fixed a few memory leaks | Benoit Steiner | 2016-04-28 | |
| | | ||||
* | | Fixed the igamma and igammac implementations to make them callable from a gpu kernel. | Benoit Steiner | 2016-04-28 | |
| | | ||||
* | | Deleted unused variable | Benoit Steiner | 2016-04-28 | |
| | | ||||
* | | Eliminate mutual recursion in igamma{,c}_impl::Run. | Justin Lebar | 2016-04-28 | |
| | | Presently, igammac_impl::Run calls igamma_impl::Run, which in turn calls igammac_impl::Run. This isn't actually mutual recursion; the calls are guarded such that we never get into a loop. Nonetheless, it's a stretch for clang to prove this. As a result, clang emits a recursive call in both igammac_impl::Run and igamma_impl::Run. That this is suboptimal code is bad enough, but it's particularly bad when compiling for CUDA/nvptx. nvptx allows recursion, but only begrudgingly: If you have recursive calls in a kernel, it's on you to manually specify the kernel's stack size. Otherwise, ptxas will dump a warning, make a guess, and who knows if it's right. This change explicitly eliminates the mutual recursion in igammac_impl::Run and igamma_impl::Run. | |||
* | | Fixed compilation error with clang. | Benoit Steiner | 2016-04-27 | |
| | | ||||
* | | Merged in rmlarsen/eigen2 (pull request PR-183) | Benoit Steiner | 2016-04-27 | |
|\ \ Detect cxx_constexpr support when compiling with clang. | |||
| * | | Depend on the more extensive support for constexpr in clang: | Rasmus Munk Larsen | 2016-04-27 | |
| | | | http://clang.llvm.org/docs/LanguageExtensions.html#c-1y-relaxed-constexpr | |||
| * | | Detect cxx_constexpr support when compiling with clang. | Rasmus Munk Larsen | 2016-04-27 | |
| | | | ||||
* | | | Merged latest update from trunk | Benoit Steiner | 2016-04-27 | |
|\| | | ||||
* | | | fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen. | Benoit Steiner | 2016-04-27 | |
| | | | ||||
| * | | Fix missing inclusion of Eigen/Core | Gael Guennebaud | 2016-04-27 | |
| | | | ||||
* | | | Made the index type a template parameter to evaluateProductBlockingSizes | Benoit Steiner | 2016-04-27 | |
|/ / Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes. | |||
* | | Merged latest updates from trunk | Benoit Steiner | 2016-04-27 | |
|\ \ | ||||
* | | | Improved support for min and max on 16 bit floats when running on recent cuda gpus | Benoit Steiner | 2016-04-27 | |
| | | | ||||
| * | | Merged eigen/eigen into default | Rasmus Larsen | 2016-04-27 | |
| |\ \ | ||||
| * | | | Use computeProductBlockingSizes to compute blocking for both ShardByCol and ShardByRow cases. | Rasmus Munk Larsen | 2016-04-27 | |
| | | | | ||||
* | | | | Added support for fpclassify in Eigen::Numext | Benoit Steiner | 2016-04-27 | |
| |/ / |/| | | ||||
* | | | Implement stricter argument checking for SYRK and SY2K and real matrices. To implement the BLAS API they should return info=2 if op='C' is passed for a complex matrix. Without this change, the Eigen BLAS fails the strict zblat3 and cblat3 tests in LAPACK 3.5. | Rasmus Munk Larsen | 2016-04-27 | |
|/ / | ||||
* | | Refactor the unsupported CXX11/Core module to internal headers only. | Gael Guennebaud | 2016-04-26 | |
| | | ||||
* | | Fixed the partial evaluation of non vectorizable tensor subexpressions | Benoit Steiner | 2016-04-25 | |
| | | ||||
* | | Refined the cost of the striding operation. | Benoit Steiner | 2016-04-25 | |
| | | ||||
* | | Merged in rmlarsen/eigen (pull request PR-179) | Benoit Steiner | 2016-04-21 | |
|\ \ Prevent crash in CompleteOrthogonalDecomposition if object was default constructed. | |||
* | | | Provide access to the base threadpool classes | Benoit Steiner | 2016-04-21 | |
| | | | ||||
| * | | Prevent crash in CompleteOrthogonalDecomposition if object was default constructed. | Rasmus Munk Larsen | 2016-04-21 | |
| | | | ||||
* | | | Added the ability to switch to the new thread pool with a #define | Benoit Steiner | 2016-04-21 | |
| | | | ||||
* | | | Use index list for the striding benchmarks | Benoit Steiner | 2016-04-21 | |
| | | | ||||
* | | | Fixed several compilation warnings | Benoit Steiner | 2016-04-21 | |
| | | | ||||
* | | | Added an option to enable the use of the F16C instruction set | Benoit Steiner | 2016-04-21 | |
| | | | ||||
* | | | Use EIGEN_THREAD_YIELD instead of std::this_thread::yield to make the code more portable. | Benoit Steiner | 2016-04-21 | |
| | | | ||||
* | | | Don't crash when attempting to reduce empty tensors. | Benoit Steiner | 2016-04-20 | |
| | | | ||||
* | | | Added more tests | Benoit Steiner | 2016-04-20 | |
| | | | ||||
* | | | Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instructions. | Benoit Steiner | 2016-04-20 | |
| | | | ||||
* | | | Started to implement a portable way to yield. | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Made sure all the required header files are included when trying to use fp16 | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Implemented a more portable version of thread local variables | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Fixed a few typos | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Fixed a compilation error with nvcc 7. | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Simplified the code that launches cuda kernels. | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Don't take the address of a kernel on CUDA devices that don't support this feature. | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Use numext::ceil instead of std::ceil | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Avoid an unnecessary copy of the evaluator. | Benoit Steiner | 2016-04-19 | |
| | | | ||||
* | | | Fixed 2 recent regression tests | Benoit Steiner | 2016-04-19 | |
|/ / | ||||
* | | Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors. | Benoit Steiner | 2016-04-19 | |
| | | ||||
* | | Worked around the lack of a rand_r function on windows systems | Benoit Steiner | 2016-04-17 | |
| | | ||||
* | | Worked around the lack of a rand_r function on windows systems | Benoit Steiner | 2016-04-17 | |
| | | ||||
* | | Enable lazy-coeff-based-product for vector*(1x1) products | Gael Guennebaud | 2016-04-16 | |
| | | ||||
* | | Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators. | Benoit Steiner | 2016-04-15 | |
| | | ||||
* | | Deleted extraneous comma. | Benoit Steiner | 2016-04-15 | |
| | |
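
The "Eliminate mutual recursion in igamma{,c}_impl::Run" entry above describes two functions that call each other behind guards, and a fix that removes the cycle from the call graph. The following is a minimal sketch of that kind of refactoring, not Eigen's actual code: the `sketch` namespace and every function name in it are hypothetical, and the split into a power series and a continued fraction (in the style of the Cephes library) is an assumption about how the two regimes are computed. The idea is that each public function dispatches only to leaf helpers, so no call path ever re-enters a public function.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

namespace sketch {  // hypothetical names; not Eigen's implementation

// Shared factor x^a * exp(-x) / Gamma(a), evaluated in log space.
inline double prefactor(double a, double x) {
  return std::exp(a * std::log(x) - x - std::lgamma(a));
}

// Leaf helper: power series for the lower function P(a, x).
// Intended for x < 1 or x < a. Calls neither public function.
inline double igamma_series(double a, double x) {
  const double eps = std::numeric_limits<double>::epsilon();
  double r = a, c = 1.0, sum = 1.0;
  for (int i = 0; i < 2000; ++i) {  // iteration cap is a safety net
    r += 1.0;
    c *= x / r;
    sum += c;
    if (c <= eps * sum) break;
  }
  return sum * prefactor(a, x) / a;
}

// Leaf helper: continued fraction for the upper function Q(a, x).
// Intended for x >= 1 and x >= a. Calls neither public function.
inline double igammac_cont_frac(double a, double x) {
  const double eps = std::numeric_limits<double>::epsilon();
  const double big = 4503599627370496.0;  // 2^52, used for rescaling
  const double big_inv = 1.0 / big;
  double y = 1.0 - a, z = x + y + 1.0, c = 0.0;
  double pkm2 = 1.0, qkm2 = x, pkm1 = x + 1.0, qkm1 = z * x;
  double ans = pkm1 / qkm1;
  for (int i = 0; i < 2000; ++i) {
    c += 1.0; y += 1.0; z += 2.0;
    const double yc = y * c;
    const double pk = pkm1 * z - pkm2 * yc;
    const double qk = qkm1 * z - qkm2 * yc;
    double t = 1.0;  // relative change between successive convergents
    if (qk != 0.0) {
      const double r = pk / qk;
      t = std::fabs((ans - r) / r);
      ans = r;
    }
    pkm2 = pkm1; pkm1 = pk; qkm2 = qkm1; qkm1 = qk;
    if (std::fabs(pk) > big) {  // rescale to avoid overflow
      pkm2 *= big_inv; pkm1 *= big_inv; qkm2 *= big_inv; qkm1 *= big_inv;
    }
    if (t <= eps) break;
  }
  return ans * prefactor(a, x);
}

// Public entry points: each one picks a leaf helper directly instead of
// delegating to the other public function.
inline double igamma(double a, double x) {
  assert(a > 0.0 && x >= 0.0);
  if (x == 0.0) return 0.0;
  if (x > 1.0 && x > a) return 1.0 - igammac_cont_frac(a, x);
  return igamma_series(a, x);
}

inline double igammac(double a, double x) {
  assert(a > 0.0 && x >= 0.0);
  if (x == 0.0) return 1.0;
  if (x < 1.0 || x < a) return 1.0 - igamma_series(a, x);
  return igammac_cont_frac(a, x);
}

}  // namespace sketch
```

With this shape the compiler sees a call graph without cycles, so it has no reason to emit a recursive call, which is the property the commit message cares about for CUDA/nvptx kernels.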