author | 2017-02-21 11:00:19 -0800 | |
---|---|---|
committer | 2017-02-21 11:00:19 -0800 | |
commit | 2c8d0dca978a246f54c506aae4587dbce5d3bcf0 (patch) | |
tree | 9efcc4097cce2224d5cd0bb83698d52d5a5a5819 /tensorflow/core/kernels/cwise_op_maximum.cc | |
parent | 43c71a03380d8de18202cc399563814b2f438cd2 (diff) |
OpenCL Improvements (#7596)
* OpenCL improvements
Added Tile, Transpose and Range Ops double support for SYCL device.
Moved gpu_device_name() to test_util.py so now it can be used in force_gpu to pull either GPU or SYCL depending on what is available in the system.
* Improvements to the SYCL device support
- Registration of Type Traits required for stride slice op
- Registration of ConcatOffset, _ListToArray, _ArrayToList
Pad, Reverse ( CPU ), ReverseV2 ( CPU ), Size, ExpandDims,
Squeeze, StridedSlice, StridedSliceGrad, StridedSliceAssign,
TileGrad, InvertPermutation, Transpose
- Registration of Sycl kernels only for essential data types
- Floor_div_real has been disabled for SYCL device
- Device in control_flow_ops_py_test.py needed to be lower cased
* SYCL support improvements (#31)
* Improvements to the SYCL device support
This commit reduces the number of failing tests when TensorFlow is
compiled with OpenCL support.
- Registration of Type Traits required for stride slice op
- Registration of ConcatOffset, _ListToArray, _ArrayToList
Pad, Reverse ( CPU ), ReverseV2 ( CPU ), Size, ExpandDims,
Squeeze, StridedSlice, StridedSliceGrad, StridedSliceAssign,
TileGrad, InvertPermutation, Transpose
- Registration of Sycl kernels only for essential data types
- Floor_div_real has been disabled for SYCL device
- Device in control_flow_ops_py_test.py needed to be lower cased
* Fixes & Version bump (#33)
* Fix Unbuntu typo. (#38)
unbuntu -> ubuntu
* Add problem descriptions and solutions (#35)
* Add ComputeCpp lib folder to LD_LIBRARY_PATH
* Add ImportError problem + solution
If you get the error message "ImportError: libComputeCpp.so: cannot open shared
object file: No such file or directory", make sure you have added the
path to ComputeCpp's lib folder to your `LD_LIBRARY_PATH`.
* Add another ImportError problem + solution
If you get the error message "ImportError: cannot import name
'pywrap_tensorflow'" you may be standing in the TensorFlow directory.
* Improvements to the SYCL device support
* Registers FloorDiv, FloorMod and SoftMax Ops for SYCL device
* Workaround for 0 bytes allocation for SYCL device (#42)
* Sycl improvements (#44)
- Eigen version bump
- Extends Cast and Cwise ops benchmark to cover Sycl device
- Extends device_lib_test.py to cover Sycl device
- Registers int32, string and ResourceHandler to run on host for
Enter and RefEnter Sycl Ops
- Enables ReduceMax op for Sycl since Eigen implementation is ready
- Registers Less op for Sycl device
* Improved the formatting of the SYCL code
* Fixed compilation error.
* Made sure that using test sessions with force_gpu=True forces the
placement on a gpu device even if none is detected.
Diffstat (limited to 'tensorflow/core/kernels/cwise_op_maximum.cc')
-rw-r--r-- | tensorflow/core/kernels/cwise_op_maximum.cc | 15 |
1 files changed, 15 insertions, 0 deletions
diff --git a/tensorflow/core/kernels/cwise_op_maximum.cc b/tensorflow/core/kernels/cwise_op_maximum.cc
index f93b5a8303..7311f25ec0 100644
--- a/tensorflow/core/kernels/cwise_op_maximum.cc
+++ b/tensorflow/core/kernels/cwise_op_maximum.cc
@@ -34,4 +34,19 @@ REGISTER_KERNEL_BUILDER(Name("Maximum")
                         BinaryOp<CPUDevice, functor::maximum<int32>>);
 #endif
 
+#ifdef TENSORFLOW_USE_SYCL
+REGISTER(BinaryOp, SYCL, "Maximum", functor::maximum, float);
+
+// A special GPU kernel for int32.
+// TODO(b/25387198): Also enable int32 in device memory. This kernel
+// registration requires all int32 inputs and outputs to be in host memory.
+REGISTER_KERNEL_BUILDER(Name("Maximum")
+                            .Device(DEVICE_SYCL)
+                            .HostMemory("x")
+                            .HostMemory("y")
+                            .HostMemory("z")
+                            .TypeConstraint<int32>("T"),
+                        BinaryOp<CPUDevice, functor::maximum<int32>>);
+#endif  // TENSORFLOW_USE_SYCL
+
 }  // namespace tensorflow
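The hunk above registers two SYCL kernels for "Maximum": a float kernel that runs on the device, and an int32 kernel that keeps "x", "y" and "z" in host memory and reuses the CPU implementation. As a rough illustration of that lookup idea only, here is a minimal standalone sketch. `ToyKernelRegistry`, `KernelInfo`, `MemorySpace`, `MakeMaximumRegistry` and the implementation labels are hypothetical names invented for this example; they are not TensorFlow's registration API, which is macro-based and considerably more involved.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical stand-in types; none of these names exist in TensorFlow.
enum class MemorySpace { kDevice, kHost };

struct KernelInfo {
  std::string implementation;  // label for the code path that computes the op
  MemorySpace operand_memory;  // where inputs and outputs are expected to live
};

class ToyKernelRegistry {
 private:
  // A kernel is identified by (op name, device name, dtype name).
  using KeyType = std::tuple<std::string, std::string, std::string>;

 public:
  // Records one kernel; registering the same key twice is treated as a bug.
  void Register(const std::string& op, const std::string& device,
                const std::string& dtype, KernelInfo info) {
    const bool inserted =
        kernels_.emplace(KeyType(op, device, dtype), std::move(info)).second;
    assert(inserted && "duplicate kernel registration");
    (void)inserted;  // avoids an unused-variable warning when NDEBUG is set
  }

  // Returns nullptr when no kernel matches the requested combination.
  const KernelInfo* Find(const std::string& op, const std::string& device,
                         const std::string& dtype) const {
    const auto it = kernels_.find(KeyType(op, device, dtype));
    return it == kernels_.end() ? nullptr : &it->second;
  }

 private:
  std::map<KeyType, KernelInfo> kernels_;
};

// Mirrors the two registrations in the diff: float on the device, and int32
// pinned to host memory and served by the CPU implementation.
ToyKernelRegistry MakeMaximumRegistry() {
  ToyKernelRegistry registry;
  registry.Register("Maximum", "SYCL", "float",
                    {"sycl_device_kernel", MemorySpace::kDevice});
  registry.Register("Maximum", "SYCL", "int32",
                    {"cpu_kernel", MemorySpace::kHost});
  return registry;
}
```

In this sketch, `MakeMaximumRegistry().Find("Maximum", "SYCL", "int32")` yields an entry marked `MemorySpace::kHost`, while a lookup for a dtype with no registration, such as `"double"`, returns `nullptr`.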