author | 2017-09-13 14:32:25 -0700 | |
---|---|---|
committer | 2017-09-13 14:36:05 -0700 | |
commit | d6f9d6109474a9162ef4d99520a2d4ef0becfb14 (patch) | |
tree | 697b9ef4b78b4ab570d7f1a074219844813cc9ed /tensorflow/core/kernels/reduction_ops_gpu_bool.cu.cc | |
parent | f445958edbca3ad292c9ed8c9de0c7e047b1d2bd (diff) |
Switch the softmax to use the new deterministic reductions on the GPU. This
results in a speedup of 10-40x on the existing ImageNet benchmarks and 2-3x
on the newly added transformer benchmarks.
Update the benchmark to also run on the GPU.
Remove duplicate CPU tests.
PiperOrigin-RevId: 168596693
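
For readers unfamiliar with the term, a "deterministic reduction" combines values in an order that depends only on the input layout and never on thread timing, so repeated runs give bitwise-identical results. The sketch below is a CPU-side illustration of a row softmax built from two such reductions (max, then sum). It is not the kernel from this commit; `TreeReduce` and `SoftmaxRow` are hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: fixed-order pairwise (tree) reduction over v[lo, hi).
// The order in which elements are combined depends only on the range,
// so the floating-point result is the same on every run.
template <typename Op>
float TreeReduce(const std::vector<float>& v, std::size_t lo, std::size_t hi,
                 float identity, Op op) {
  if (hi <= lo) return identity;
  if (hi - lo == 1) return v[lo];
  const std::size_t mid = lo + (hi - lo) / 2;
  return op(TreeReduce(v, lo, mid, identity, op),
            TreeReduce(v, mid, hi, identity, op));
}

// Hypothetical helper: numerically stable softmax of a single row,
// expressed as a max reduction followed by a sum reduction.
std::vector<float> SoftmaxRow(const std::vector<float>& logits) {
  std::vector<float> out(logits.size());
  if (logits.empty()) return out;

  // Reduction 1: row maximum, subtracted before exp() to avoid overflow.
  const float max_logit = TreeReduce(
      logits, 0, logits.size(), -std::numeric_limits<float>::infinity(),
      [](float a, float b) { return std::max(a, b); });

  for (std::size_t i = 0; i < logits.size(); ++i) {
    out[i] = std::exp(logits[i] - max_logit);
  }

  // Reduction 2: sum of the shifted exponentials, in the same fixed order.
  const float sum = TreeReduce(out, 0, out.size(), 0.0f,
                               [](float a, float b) { return a + b; });

  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] /= sum;
  }
  return out;
}
```

Calling `SoftmaxRow({1.0f, 2.0f, 3.0f})` twice should return identical vectors whose entries sum to approximately 1. A GPU version would typically apply the same fixed-order idea per block or per warp instead of accumulating with atomics, whose ordering can vary between runs.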
Diffstat (limited to 'tensorflow/core/kernels/reduction_ops_gpu_bool.cu.cc')
-rw-r--r-- | tensorflow/core/kernels/reduction_ops_gpu_bool.cu.cc | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/tensorflow/core/kernels/reduction_ops_gpu_bool.cu.cc b/tensorflow/core/kernels/reduction_ops_gpu_bool.cu.cc
index 3e7a33ba3f..79ec1d59df 100644
--- a/tensorflow/core/kernels/reduction_ops_gpu_bool.cu.cc
+++ b/tensorflow/core/kernels/reduction_ops_gpu_bool.cu.cc
@@ -17,7 +17,7 @@ limitations under the License.
 
 #define EIGEN_USE_GPU
 
-#include "tensorflow/core/kernels/reduction_ops_gpu_kernels.h"
+#include "tensorflow/core/kernels/reduction_gpu_kernels.cu.h"
 
 namespace tensorflow {
 namespace functor {