path: root/unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h
Commit message (Author, Date)
* Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. The cost model is turned off by default. (Rasmus Munk Larsen, 2016-04-14)
|
* Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions. (Rasmus Munk Larsen, 2016-04-14)
|
* Fixed compilation warnings on arm (Benoit Steiner, 2016-03-28)
|
* Avoid unnecessary conversions (Benoit Steiner, 2016-03-23)
|
* Fixed compilation warning (Benoit Steiner, 2016-03-23)
|
* Use a single Barrier instead of a collection of Notifications to reduce the thread synchronization overhead (Benoit Steiner, 2016-03-22)
|
* Avoid implicit cast (Benoit Steiner, 2016-03-09)
|
* Avoid unnecessary conversion from 32bit int to 64bit unsigned int (Benoit Steiner, 2016-03-09)
|
* Replace std::vector with our own implementation, as using the stl when compiling with nvcc and avx enabled leads to many issues. (Benoit Steiner, 2016-03-08)
|
* Simplified the full reduction code (Benoit Steiner, 2016-03-08)
|
* Decoupled the packet type definition from the definition of the tensor ops. All the vectorization is now defined in the tensor evaluators. This will make it possible to reliably support devices with different packet types in the same compilation unit. (Benoit Steiner, 2016-03-08)
|
* Made the signature of the inner and outer reducers consistent (Benoit Steiner, 2016-02-29)
|
* Optimized the performance of narrow reductions on CUDA devices (Benoit Steiner, 2016-02-29)
|
* Fixed a typo in the reduction code that could prevent large full reductions from running properly on old cuda devices. (Benoit Steiner, 2016-02-24)
|
* Fixed a number of compilation warnings generated by the cuda tests (Benoit Steiner, 2016-01-31)
|
* Fixed a couple of compilation warnings. (Benoit Steiner, 2016-01-28)
|
* Fixed some compilation problems with nvcc + clang (Benoit Steiner, 2016-01-27)
|
* Record whether the underlying tensor storage can be accessed directly during the evaluation of an expression. (Benoit Steiner, 2016-01-19)
|
* Properly record the rank of reduced tensors in the tensor traits. (Benoit Steiner, 2016-01-13)
|
* Merged in jeremy_barnes/eigen/shader-model-3.0 (pull request PR-152) (Benoit Steiner, 2016-01-11)
|\    Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations.
* | Fixed a bug in the dispatch of optimized reduction kernels. (Benoit Steiner, 2016-01-11)
| |
* | Re-enabled the optimized reduction CUDA code. (Benoit Steiner, 2016-01-11)
| |
| * Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations. This allows Tensorflow to work on SM 3.0 (i.e., Amazon EC2) machines. (Jeremy Barnes, 2016-01-10)
|/
* Simplified the dispatch code. (Benoit Steiner, 2016-01-08)
|
* Reworked the dispatch of optimized cuda reduction kernels to work around an nvcc bug that prevented the code from compiling in optimized mode in some cases (Benoit Steiner, 2016-01-08)
|
* Improved the performance of reductions on CUDA devices (Benoit Steiner, 2016-01-04)
|
* Optimized outer reduction on GPUs. (Benoit Steiner, 2015-12-22)
|
* Silenced some compilation warnings triggered by nvcc (Benoit Steiner, 2015-12-17)
|
* Simplified more of the IndexList code. (Benoit Steiner, 2015-11-12)
|
* Started to make the IndexList code compile by more compilers (Benoit Steiner, 2015-11-12)
|
* Fixed CUDA compilation errors (Benoit Steiner, 2015-11-11)
|
* Code cleanup (Benoit Steiner, 2015-11-06)
|
* Misc fixes to full reductions (Benoit Steiner, 2015-11-05)
|
* Updated the reduction code so that full reductions now return a tensor of rank 0. (Benoit Steiner, 2015-11-04)
|
* Many files were missing in previous changeset. (Gael Guennebaud, 2015-07-29)
|
* Silenced a number of compilation warnings (Benoit Steiner, 2015-06-29)
|
* Improved performance of full reduction by 2 orders of magnitude on CPU and 3 orders of magnitude on GPU (Benoit Steiner, 2015-06-29)
|
* Worked around some constexpr related bugs in nvcc 7 (Benoit Steiner, 2015-05-28)
|
* Silenced a few compilation warnings generated by nvcc (Benoit Steiner, 2015-02-10)
|
* Silenced a few compilation warnings (Benoit Steiner, 2015-02-10)
|
* Added the EIGEN_HAS_CONSTEXPR define (Benoit Steiner, 2015-02-06)
|     Gate the tensor index list code based on the value of EIGEN_HAS_CONSTEXPR
* Silenced some compilation warnings (Benoit Steiner, 2015-01-30)
|
* Improved the performance of tensor reductions that preserve the innermost dimension(s). (Benoit Steiner, 2015-01-27)
|
* Improved the performance of tensor reductions (Benoit Steiner, 2015-01-14)
|     Added the ability to generate random numbers following a normal distribution
|     Created a test to validate the ability to generate random numbers.
* Silenced a few compilation warnings (Benoit Steiner, 2014-10-16)
|     Generalized a TensorMap constructor
* Added support for tensor reductions and concatenations (Benoit Steiner, 2014-10-01)