| Commit message | Author | Date |
|---|---|---|
| Simplify Eigen package config (#3288). Add missing `unsupported/Eigen/*` headers; fix pip `setup.py`; adjust the new Eigen header; fix a Bazel include dependency error; adjust the Makefile to work with the Eigen changes; remove the nvcc workaround for CUDA <= 6.0 (CUDA versions prior to 6.5 produced a "kernel launches from templates are not allowed in system files" error when using gcc 4.8 and including code that uses templated kernel launches via `-isystem`; to work around this, the GPU crosstool converted `-isystem` arguments containing the CUDA headers into `-iquote` arguments, and that workaround has now been removed); configure cmake and make to get the Eigen version from `tensorflow/workspace.bzl`. | Igor Babuschkin | 2016-07-19 |
| Switched to the latest version of Eigen that provides significant performance improvements for fp16. Added SpecialFunctions to the list of Eigen headers TensorFlow depends on. Change: 127264575 | Benoit Steiner | 2016-07-12 |
| Automated rollback of change 127233960. Change: 127253427 | Vijay Vasudevan | 2016-07-12 |
| Switched to the latest version of Eigen that provides significant performance improvements for fp16. Change: 127233960 | Benoit Steiner | 2016-07-12 |
| Adds a "currentThreadIndex" method to Eigen's ThreadPoolDevice. Use it to handle per-thread buffer allocation for the tileable executor without resorting to thread_local, which is not fully supported on Android. Change: 126009029 | A. Unique TensorFlower | 2016-06-27 |
| Upgraded Eigen to the latest version that provides new scan operations. This will enable the implementation of the cumsum operation in TensorFlow. Change: 125697517 | Benoit Steiner | 2016-06-23 |
| Enable the vectorization of adds and mults on fp16s. This improves the performance of the toy MNIST training by one order of magnitude. Change: 124374286 | Benoit Steiner | 2016-06-08 |
| Improved the performance of full reductions on GPU.<br>NEW:<br>`BM_fullReduction/10   4591   4595   153149  20.8M items/s`<br>`BM_fullReduction/64   5073   5075   100000  770.0M items/s`<br>`BM_fullReduction/512  9067   9070   75263   26.9G items/s`<br>`BM_fullReduction/4k   243984 244125 2868    64.0G items/s`<br>`BM_fullReduction/5k   359125 359273 1951    64.8G items/s`<br>OLD:<br>`BM_fullReduction/10   9085   9087   74395   10.5M items/s`<br>`BM_fullReduction/64   9478   9478   72014   412.1M items/s`<br>`BM_fullReduction/512  14643  14646  46902   16.7G items/s`<br>`BM_fullReduction/4k   260338 260384 2678    60.0G items/s`<br>`BM_fullReduction/5k   385076 385178 1818    60.5G items/s`<br>Change: 124290852 | Benoit Steiner | 2016-06-07 |
| Enable fp16 for most of the pooling ops (MaxPool, AvgPool, associated gradients, some variants, etc.). Change: 124197406 | Benoit Steiner | 2016-06-06 |
| Enable fp16 for most of the pooling ops (MaxPool, AvgPool, associated gradients, some variants, etc.). Change: 123967787 | A. Unique TensorFlower | 2016-06-03 |
| Enable fp16 for most of the pooling ops (MaxPool, AvgPool, associated gradients, some variants, etc.). Change: 123967117 | Benoit Steiner | 2016-06-03 |
| Added support for convolutions of 16-bit floats on CPU. Change: 123659102 | Benoit Steiner | 2016-05-31 |
| Upgraded to the latest version of Eigen that supports convolutions on fp16. Change: 123238579 | Benoit Steiner | 2016-05-25 |
| Switched to the latest version of Eigen, which performs much better on machines with many CPU cores. For example, the wall time for the following tutorial went down from 13m35s to 5m27s: `bazel run -c opt --copt=-mavx tensorflow/examples/tutorials/word2vec/word2vec_basic`. Change: 122462177 | Benoit Steiner | 2016-05-16 |
| Made the contraction code compatible with fp16. Change: 122192081 | Benoit Steiner | 2016-05-12 |
| Upgraded to the latest version of Eigen, which speeds up full reductions on fp16 by about three orders of magnitude, as well as some partial reductions by 30%, when using CUDA 7.5 or above. Change: 122191448 | Benoit Steiner | 2016-05-12 |
| Improved support for min and max on 16-bit floats when running on recent CUDA GPUs. Updated the check-numerics code to make it compatible with fp16. Change: 120980302 | Benoit Steiner | 2016-04-27 |
| Made it possible to compute the cross entropy using 16-bit floats. Change: 120739269 | Benoit Steiner | 2016-04-25 |
| Rollback of rollback of cl/120366069: tensorflow: switch to Eigen thread pool. This is the first step of switching TensorFlow to the new non-blocking thread pool in Eigen. Change: 120510292 | A. Unique TensorFlower | 2016-04-21 |
| Prevent TensorFlow from crashing when attempting to reduce an empty tensor on GPU. Change: 120505517 | Benoit Steiner | 2016-04-21 |
| Fixed a compilation error when targeting CUDA 3.0 devices such as the ones offered by AWS. Change: 120369420 | Benoit Steiner | 2016-04-20 |
| Upgraded to the latest version of Eigen, which adds support for computing the sigmoid of fp16 and introduces a condition estimator. Change: 119907721 | Benoit Steiner | 2016-04-14 |
| Added support for trigonometric and transcendental functions of half floats. Change: 119850987 | Benoit Steiner | 2016-04-14 |
| Upgraded to the latest version of Eigen that provides significant performance improvements for fp16. Change: 119771118 | Benoit Steiner | 2016-04-13 |
| Made isinf, isnan, isfinite, ceil, and floor work with 16-bit floats. Change: 119458778 | Benoit Steiner | 2016-04-09 |
| Upgraded to the latest version of Eigen, which has bug fixes for complex numbers as well as fp16. Change: 119398881 | Benoit Steiner | 2016-04-08 |
| Upgraded to the latest version of Eigen, which introduces implementations of the zeta and polygamma functions, as well as improved support for float16. Change: 119279101 | Benoit Steiner | 2016-04-07 |
| Upgraded to the latest version of Eigen, which provides better support for float16 and fixes the computation of absolute values on GPU. Change: 119001808 | Benoit Steiner | 2016-04-04 |
| Upgraded to the latest version of Eigen. Change: 118414762 | Benoit Steiner | 2016-03-28 |
| Upgraded to the latest version of Eigen, which provides better support for fp16. Use Eigen mod functors directly instead of duplicating them. Change: 118362359 | Benoit Steiner | 2016-03-28 |
| Move the NeuralNetwork code out of third_party/eigen3 and into tensorflow/core/kernel. Change: 117941211 | Benoit Steiner | 2016-03-23 |
| TensorFlow: update Eigen to the latest change to fix TensorChipping. Change: 117570343 | Vijay Vasudevan | 2016-03-18 |
| TensorFlow: update Eigen to the latest release, which has a fix for a too-large frame. Change: 117506296 | Vijay Vasudevan | 2016-03-18 |
| Added basic support for float16 on CPUs and older GPUs. Also fixed compilation issues with CUDA devices that support compute model 5.3. Change: 117493644 | Benoit Steiner | 2016-03-18 |
| Upgraded to a newer version of Eigen that fixes a compilation error on Android. Change: 116831720 | Benoit Steiner | 2016-03-09 |
| Upgraded Eigen to make it possible to compile a binary that takes advantage of both AVX instructions and CUDA to run as fast as possible. Change: 116775924 | Benoit Steiner | 2016-03-09 |
| Upgraded to a new version of Eigen that adds the ability to pad using values other than 0 and significantly speeds up a number of computations on GPUs. Change: 116607765 | Benoit Steiner | 2016-03-08 |
| Added the ability to convert between floats and float16 on Kepler and Maxwell GPUs. Change: 116409601 | Benoit Steiner | 2016-03-05 |
| Improved the performance of outer reductions. Change: 116063261 | Benoit Steiner | 2016-03-01 |
| Improved the performance of narrow reductions on CUDA. Change: 115889721 | Benoit Steiner | 2016-02-29 |
| Upgraded Eigen to fix a compilation error triggered by Xcode. Change: 115280348 | Benoit Steiner | 2016-02-22 |
| Upgraded to the latest version of Eigen, which adds a missing #include. Change: 115268843 | Benoit Steiner | 2016-02-22 |
| Added support for half floats to Eigen, which is the first step to supporting half floats in TensorFlow. The code was tested on a Tegra X1. Change: 115253733 | Benoit Steiner | 2016-02-22 |
| TensorFlow: update Eigen to a non-broken GPU commit. Change: 114585944 | Vijay Vasudevan | 2016-02-12 |
| Switch the slow path in matrix_solve_ls to using Eigen::CompleteOrthogonalDecomposition (COD), which I recently contributed to Eigen in https://bitbucket.org/eigen/eigen/pull-requests/163/implement-complete-orthogonal/diff. The advantage of COD over column-pivoted QR is that it is able to compute the minimum-norm solution when the matrix is rank-deficient, which is usually the desired behavior and makes it consistent with the fast path. Change: 114483303 | A. Unique TensorFlower | 2016-02-11 |
| TensorFlow: change int -> size_t to silence a warning. Change: 114243879 | Vijay Vasudevan | 2016-02-09 |
| Replay of PR #969 for eigen3. Change: 113963788 | A. Unique TensorFlower | 2016-02-05 |
| Upgraded to the latest version of Eigen, which brings miscellaneous performance and stability improvements. Change: 113791782 | Benoit Steiner | 2016-02-04 |
| Backported several changes from the upstream version of Eigen. Change: 113371678 | Benoit Steiner | 2016-01-29 |
| Fix OSS builds. Change: 113114631 | Vijay Vasudevan | 2016-01-26 |
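The 2016-06-27 entry above describes addressing per-thread scratch buffers through a stable pool-thread index (Eigen's `currentThreadIndex` on `ThreadPoolDevice`) instead of `thread_local` storage, which was not fully supported on Android at the time. A minimal Python sketch of that pattern follows; the names (`current_thread_index`, `buffers`) are illustrative, not Eigen's API:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

NUM_THREADS = 4
# One preallocated scratch slot per pool thread, addressed by index
# instead of by thread-local storage.
buffers = [[] for _ in range(NUM_THREADS)]

_indices = {}              # thread ident -> stable pool index
_lock = threading.Lock()

def current_thread_index():
    """Return this worker's stable index, assigning one on first call
    (analogous in spirit to Eigen's currentThreadIndex)."""
    ident = threading.get_ident()
    with _lock:
        if ident not in _indices:
            _indices[ident] = len(_indices)
        return _indices[ident]

def work(item):
    idx = current_thread_index()
    buffers[idx].append(item)  # each thread writes only its own slot
    return idx

with ThreadPoolExecutor(max_workers=NUM_THREADS) as pool:
    indices = list(pool.map(work, range(100)))

assert sum(len(b) for b in buffers) == 100
```

Each worker touches only its own slot, so the buffers themselves need no locking; only the one-time index assignment is synchronized.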
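The 2016-02-11 entry switches the slow path of matrix_solve_ls to Eigen's CompleteOrthogonalDecomposition precisely because COD returns the minimum-norm solution for rank-deficient systems. A toy pure-Python illustration of that property uses the explicit formula x = A^T (A A^T)^{-1} b, which gives the minimum-norm solution for an underdetermined system when A has full row rank; Eigen's COD reaches the same solution via a rank-revealing factorization rather than this formula:

```python
# Single underdetermined equation: x0 + x1 = 2, i.e. A is 1x2.
# Infinitely many solutions exist; the minimum-norm one is [1.0, 1.0].
A = [[1.0, 1.0]]
b = [2.0]

# A·A^T is 1x1 here, so "inversion" is just a scalar reciprocal.
aat = sum(v * v for v in A[0])     # A A^T = 2.0
y = b[0] / aat                     # (A A^T)^{-1} b = 1.0
x = [v * y for v in A[0]]          # A^T y

print(x)  # -> [1.0, 1.0]

# Its squared norm is 2.0; any other solution, e.g. [2.0, 0.0]
# (squared norm 4.0), is strictly larger.
norm2 = sum(v * v for v in x)
```

The explicit formula breaks down when A A^T is singular (the truly rank-deficient case), which is exactly where a rank-revealing factorization like COD is needed.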