Commit message (Author, Age)
* Added an option to enable the use of the F16C instruction set (Benoit Steiner, 2016-04-21)
|
* Use EIGEN_THREAD_YIELD instead of std::this_thread::yield to make the code more portable. (Benoit Steiner, 2016-04-21)
|
* Don't crash when attempting to reduce empty tensors. (Benoit Steiner, 2016-04-20)
|
* Added more tests (Benoit Steiner, 2016-04-20)
|
* Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instructions. (Benoit Steiner, 2016-04-20)
|
* Started to implement a portable way to yield. (Benoit Steiner, 2016-04-19)
|
* Made sure all the required header files are included when trying to use fp16 (Benoit Steiner, 2016-04-19)
|
* Implemented a more portable version of thread local variables (Benoit Steiner, 2016-04-19)
|
* Fixed a few typos (Benoit Steiner, 2016-04-19)
|
* Fixed a compilation error with nvcc 7. (Benoit Steiner, 2016-04-19)
|
* Simplified the code that launches cuda kernels. (Benoit Steiner, 2016-04-19)
|
* Don't take the address of a kernel on CUDA devices that don't support this feature. (Benoit Steiner, 2016-04-19)
|
* Use numext::ceil instead of std::ceil (Benoit Steiner, 2016-04-19)
|
* Avoid an unnecessary copy of the evaluator. (Benoit Steiner, 2016-04-19)
|
* Fixed 2 recent regression tests (Benoit Steiner, 2016-04-19)
|
* Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors. (Benoit Steiner, 2016-04-19)
|
* Worked around the lack of a rand_r function on windows systems (Benoit Steiner, 2016-04-17)
|
* Worked around the lack of a rand_r function on windows systems (Benoit Steiner, 2016-04-17)
|
* Enable lazy-coeff-based-product for vector*(1x1) products (Gael Guennebaud, 2016-04-16)
|
* Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators. (Benoit Steiner, 2016-04-15)
|
* Deleted extraneous comma. (Benoit Steiner, 2016-04-15)
|
* Deleted unnecessary variable (Benoit Steiner, 2016-04-15)
|
* Fixed a few compilation warnings (Benoit Steiner, 2016-04-15)
|
* Merged in rmlarsen/eigen (pull request PR-178) (Benoit Steiner, 2016-04-15)
|\  Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions.
* | bug #1203: by-pass large stack-allocation in stableNorm if EIGEN_STACK_ALLOCATION_LIMIT is too small (Gael Guennebaud, 2016-04-15)
| |
| * Get rid of void* casting when calling EvalRange::run. (Rasmus Munk Larsen, 2016-04-15)
| |
* | Fixed compilation errors with msvc (Benoit Steiner, 2016-04-15)
| |
* | Improved the matrix multiplication blocking in the case where mr is not a power of 2 (e.g. on Haswell CPUs). (Benoit Steiner, 2016-04-15)
| |
* | Fix trmv for mixing types. (Gael Guennebaud, 2016-04-15)
| |
* | Added ability to access the cache sizes from the tensor devices (Benoit Steiner, 2016-04-14)
| |
* | Added support for exclusive or (Benoit Steiner, 2016-04-14)
| |
| * Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. The cost model is turned off by default. (Rasmus Munk Larsen, 2016-04-14)
| |
* | Added missing definition of PacketSize in the gpu evaluator of convolution (Benoit Steiner, 2016-04-14)
| |
* | Merged in rmlarsen/eigen (pull request PR-177) (Benoit Steiner, 2016-04-14)
|\|  Eigen Tensor cost model part 1.
* | Enabled the new threadpool tests (Benoit Steiner, 2016-04-14)
| |
* | Cleanup (Benoit Steiner, 2016-04-14)
| |
* | Prepared the migration to the new non blocking thread pool (Benoit Steiner, 2016-04-14)
| |
| * Improvements to cost model. (Rasmus Munk Larsen, 2016-04-14)
| |
* | Merged latest updates from trunk (Benoit Steiner, 2016-04-14)
|\ \
* | | Added tests for the non blocking thread pool (Benoit Steiner, 2016-04-14)
| | |
* | | Added a more scalable non blocking thread pool (Benoit Steiner, 2016-04-14)
| | |
| | * Merge upstream updates. (Rasmus Munk Larsen, 2016-04-14)
| | |\
| | |/
| |/|
| | * Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions. (Rasmus Munk Larsen, 2016-04-14)
| | |
| * | Add extreme values to the imaginary part for SVD unit tests. (Gael Guennebaud, 2016-04-14)
| | |
| * | Improve numerical robustness of JacobiSVD: (Gael Guennebaud, 2016-04-14)
|/ /    - avoid noise amplification in complex to real conversion
| |     - compare off-diagonal entries to the current biggest diagonal entry: no need to bother about a 2x2 block containing ridiculously small entries compared to the rest of the matrix.
* | Force the inlining of the << operator on half floats (Benoit Steiner, 2016-04-14)
| |
* | Inline the << operator on half floats (Benoit Steiner, 2016-04-14)
| |
* | Silenced a compilation warning (Benoit Steiner, 2016-04-14)
| |
* | Added tests to validate flooring and ceiling of fp16 (Benoit Steiner, 2016-04-14)
| |
* | Added simple test for numext::sqrt and numext::pow on fp16 (Benoit Steiner, 2016-04-14)
| |