Commit message (Author, Age)
* Added the ability to use a scratch buffer in cuda kernels (Benoit Steiner, 2016-05-09)
|
* Added a new parallelFor api to the thread pool device. (Benoit Steiner, 2016-05-09)
|
* Optimized the non blocking thread pool: (Benoit Steiner, 2016-05-09)
|   * Use a pseudo-random permutation of queue indices during random stealing. This ensures that all the queues are considered.
|   * Directly pop from a non-empty queue when we are waiting for work, instead of first noticing that there is a non-empty queue and then doing another round of random stealing to re-discover the non-empty queue.
|   * Steal only 1 task from a remote queue instead of half of tasks.
* Pulled latest updates from trunk (Benoit Steiner, 2016-05-07)
|\
* | Worked around a bug in nvcc on tegra x1 (Benoit Steiner, 2016-05-07)
| |
* | Merged latest updates from trunk (Benoit Steiner, 2016-05-06)
|\ \
* | | Added support for packet processing of fp16 on kepler and maxwell gpus (Benoit Steiner, 2016-05-06)
| | |
| * | Avoid double promotion (Benoit Steiner, 2016-05-06)
| | |
* | | Marked a few tensor operations as read only (Benoit Steiner, 2016-05-05)
|/ /
* | Added a test to validate full reduction on tensor of half floats (Benoit Steiner, 2016-05-05)
| |
* | Made the testing of contractions on fp16 more robust (Benoit Steiner, 2016-05-05)
| |
* | Refined the testing of log and exp on fp16 (Benoit Steiner, 2016-05-05)
| |
* | Further improved the testing of fp16 (Benoit Steiner, 2016-05-05)
| |
* | Relaxed the dummy precision for fp16 (Benoit Steiner, 2016-05-05)
| |
* | Relaxed an assertion that was tighter than necessary. (Benoit Steiner, 2016-05-05)
| |
* | Added a benchmark to measure the performance of full reductions of 16 bit floats (Benoit Steiner, 2016-05-05)
| |
* | Fixed some incorrect assertions (Benoit Steiner, 2016-05-05)
| |
* | Avoid unnecessary type promotion (Benoit Steiner, 2016-05-05)
| |
* | Strongly hint but don't force the compiler to unroll some loops in the tensor executor. This results in up to 27% faster code. (Benoit Steiner, 2016-05-05)
| |
* | Avoided unnecessary type promotion (Benoit Steiner, 2016-05-05)
| |
* | Added tests for full contractions using thread pools and gpu devices. (Benoit Steiner, 2016-05-05)
| |   Fixed a couple of issues in the corresponding code.
* | Updated the contraction code to ensure that full contractions return a tensor of rank 0 (Benoit Steiner, 2016-05-05)
| |
* | Fixed some signed/unsigned comparison warnings (Christoph Hertzberg, 2016-05-05)
| |
* | Enable and fix -Wdouble-conversion warnings (Christoph Hertzberg, 2016-05-05)
| |
* | Reduced the memory footprint of the cxx11_tensor_image_patch test (Benoit Steiner, 2016-05-04)
| |
* | Removed extraneous 'explicit' keywords (Benoit Steiner, 2016-05-04)
| |
* | fix double-promotion/float-conversion in Core/SpecialFunctions.h (Ola Røer Thorsen, 2016-05-04)
| |
* | Improve documentation of BDCSVD (Gael Guennebaud, 2016-05-04)
| |
* | Use numext::isfinite instead of std::isfinite (Benoit Steiner, 2016-05-03)
| |
* | bug #1214: consider denormals as zero in D&C SVD. This also works around an infinite binary search when compiling with ICC's unsafe optimizations. (Gael Guennebaud, 2016-05-03)
| |
* | Added a test to validate the computation of exp and log on 16bit floats (Benoit Steiner, 2016-05-03)
| |
* | Fixed compilation error with cuda >= 7.5 (Benoit Steiner, 2016-05-03)
| |
* | Deleted superfluous explicit keyword. (Benoit Steiner, 2016-05-03)
| |
* | Made a cast explicit (Benoit Steiner, 2016-05-02)
| |
* | Pulled latest updates from trunk (Benoit Steiner, 2016-05-01)
|\ \
* | | Fixed compilation error (Benoit Steiner, 2016-05-01)
| | |
| * | Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed size assignment. (Gael Guennebaud, 2016-05-01)
|/ /
* | Added missing accessors to fixed sized tensors (Benoit Steiner, 2016-04-29)
| |
* | Deleted trailing commas (Benoit Steiner, 2016-04-29)
| |
* | Deleted useless trailing commas (Benoit Steiner, 2016-04-29)
| |
* | Deleted unnecessary trailing commas. (Benoit Steiner, 2016-04-29)
| |
* | Fixed compilation errors generated by clang (Benoit Steiner, 2016-04-29)
| |
* | Added a few tests to ensure that the dimensions of rank 0 tensors are correctly computed (Benoit Steiner, 2016-04-29)
| |
* | Return the proper size (ie 1) for tensors of rank 0 (Benoit Steiner, 2016-04-29)
| |
* | Made several tensor tests compatible with cxx03 (Benoit Steiner, 2016-04-29)
| |
* | Moved a number of tensor tests that don't require cxx11 to work properly outside the EIGEN_TEST_CXX11 test section (Benoit Steiner, 2016-04-29)
| |
* | Fixed the cxx11_tensor_empty test to compile without requiring cxx11 support (Benoit Steiner, 2016-04-29)
| |
* | Deleted unused default values for template parameters (Benoit Steiner, 2016-04-29)
| |
* | Made a couple of tensor tests compile without requiring c++11 support. (Benoit Steiner, 2016-04-29)
| |
* | Made the cxx11_tensor_forced_eval compile without c++11. (Benoit Steiner, 2016-04-29)
| |