| |
* OpenCL Improvements
* Registers Scatter and ScatterNd Ops for SYCL
* Registers Stack op for SYCL
* Fixes "No sycl buffer found" error for debug ops
* Registers MatMul and Transpose Ops to SYCL device for double
* Extends analyzer_cli_test.py test to cover SYCL
* Fixes Transpose Op for double when on SYCL
* Bumps Eigen version to fix double precision issue on SYCL
* Extends SessionDebugTestBase to cover SYCL
* Register SYCL implementations for random ops
* Avoid functions that might not be defined on SYCL device (#51)
* Avoid functions that might not be defined on SYCL device
* Simplify by using Eigen math functions
* OpenCL improvements
- Bumps Eigen Version
- Refactors Ops registration
- Introduces a workaround for the Const Op related to the difference between
CUDA, which uses pointers, and OpenCL, which uses buffers/accessors
- Extends memory types to cover DEVICE_SYCL as well
- Introduces GetSYCLDevice() method that returns a list of supported devices,
with the GPU device having the highest priority (blacklisted devices are excluded)
- ::internal::Transpose -> tensorflow::internal::Transpose in order to
avoid a reported compilation error
- Re-introduces the fix for a buggy string replacement that caused many compilation
warnings: -c -> --include
- Adds sycl_runtime to Bazel's ARRAY_DEPS
- Replicates TF_CALL_GPU_PROXY_TYPES for SYCL
* [OpenCL] Fixes an issue caused by switch to aligned allocator for sycl buffer (#53)
* [Build] Use gcc/g++ as a host compiler to avoid https://github.com/tensorflow/tensorflow/issues/8394 (#54)
* [OpenCL] Fixes Scatter Op
* Fix testSimple and testConst in stack_op_test (#3)
* Fix testSimple and testConst in stack_op_test
* Create a specialisation of DoParallelConcatUpdate for SyclDevice and
register it
* Guard all code in TENSORFLOW_USE_SYCL
* Do not use sycl device for int32
* Registration of the Sycl version is now looking like the one for the GPU
* Remove added empty line
* Register batch normalization kernels for OpenCL (#61)
* [OpenCL] RandomGamma has no GPU friendly implementation (#57)
* [OpenCL] Compatibility fixes for TensorFlow 1.1.0-rc1
* [OpenCL] Implements BatchMatmul Op for SYCL
* Lowercase the device name when GPU or SYCL is returned
* [OpenCL] kernel_estimator_test.py assertEqual-> assertAlmostEqual due to floating point representation on the device
* [Eigen] Version bump
* GPU device name string manipulation is not needed anymore
* [OpenCL] Adds SYCL to device backwards compatibility
* [OpenCL] Extends core_rnn_test.py to run for SYCL device
* [OpenCL] Minor optimizations for build script
* [OpenCL] Enables skip folder list in build script
* [OpenCL] Fixes ApplyAdamOp for Sycl device
* [OpenCL] SYCL device improvements
* [OpenCL] Fixes debug_ops's SEGFAULT for SYCL device
* [Build] Adds hexagon to skipped folders list
* [OpenCL] Removes EnterLameDuckMode from SYCL device and allocator
* [OpenCL] Registers Unique Op for SYCL device
* [OpenCL][Temporary] Disables tests for SYCL target due to features not being implemented yet
Tests affected:
- tensorflow/contrib/memory_stats/python/kernel_tests/memory_stats_ops_test.py
- tensorflow/contrib/rnn/python/kernel_tests/core_rnn_test.py
- tensorflow/python/kernel_tests/conv_ops_test.py
- tensorflow/python/kernel_tests/depthwise_conv_op_test.py
- tensorflow/python/kernel_tests/pooling_ops_3d_test.py
- tensorflow/python/kernel_tests/pooling_ops_test.py
- tensorflow/python/kernel_tests/scatter_nd_ops_test.py
- tensorflow/python/training/adam_test.py
- tensorflow/python/training/localhost_cluster_performance_test.py
- tensorflow/python/training/training_ops_test.py
* [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline
Tests affected:
- tensorflow/python/debug/cli/analyzer_cli_test.py
- tensorflow/python/debug/lib/session_debug_testlib.py
- tensorflow/python/debug/lib/stepper_test.py
- tensorflow/python/kernel_tests/unstack_op_test.py
- tensorflow/python/ops/image_ops_test.py
* [OpenCL] Take options.config.device_count() into consideration
* [OpenCL] Fixes compilation warning
* [OpenCL] device:SYCL:0 -> sycl:0
* [OpenCL] Removes unwanted flags in building script
Removes flags given to computecpp that enable SIMD instructions
Removes duplicate flags
* bool -> const bool
* [OpenCL] sycl in test_util.gpu_device_name() -> is_sycl_enabled()
* [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline
Test affected:
- tensorflow/contrib/stateless/python/kernel_tests/stateless_random_ops_test.py
* Imports test_util from tensorflow.python.framework
* [OpenCL] Fixes formatting in Python code
* [OpenCL] Extends session_test.py to cover SYCL device
* [OpenCL] Cleans singleton class
* [OpenCL] Keeping CUDA happy
* [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline
Tests affected:
- tensorflow/contrib/rnn/python/kernel_tests/core_rnn_cell_test.py
- tensorflow/contrib/seq2seq/python/kernel_tests/beam_search_ops_test.py
* Added support for building with SYCL on ARM.
* Acts on the review feedback from:
- https://github.com/tensorflow/tensorflow/pull/9117#discussion_r113608975
- https://github.com/tensorflow/tensorflow/pull/9117#discussion_r113609173
* [OpenCL] Fixes scatter_nd_op_test
* Fixes auto-merge mistake
* [OpenCL] struct SyclDevice -> class SyclDevice
* Revert "[OpenCL] struct SyclDevice -> class SyclDevice"
This reverts commit addd43348c374a5379f67bb1e5ad084715722fc2.
* [OpenCL] Reverting refactoring commit.
As requested in the review https://github.com/tensorflow/tensorflow/pull/9117#issuecomment-298454466
This change set will be re-introduced in smaller chunks.
* Revert "[OpenCL] device:SYCL:0 -> sycl:0"
This reverts commit cf16e60340b62d16c3764d71b716fe03d35f87a9.
* Revert "[OpenCL] Adds SYCL to device backwards compatibility"
This reverts commit b8401b5164199b7a169be1c1d8dea5001195c390.
* Acts on the feedback from https://github.com/tensorflow/tensorflow/pull/9117#discussion_r115036905
* control_flow_ops_py_test.py expects device name to be lower cased
* Acts on the feedback from https://github.com/tensorflow/tensorflow/pull/9117#discussion_r115037222
* Removes debug print
* Removes not needed partial specialisation
* [OpenCL] Registers ScatterNdFunctor for SYCL device
* [OpenCL] Make it compile
* [OpenCL] Follow gpu_device changes
* [OpenCL] Adds cxx_builtin_include_directory for python lib
Fixes Bazel's undeclared-inclusion errors that appeared after the
merge with TensorFlow upstream
* [OpenCL] Fixes Constant Op
* [OpenCL] gXX-4.8 -> gXX
* [OpenCL] Removes -D_GLIBCXX_USE_CXX11_ABI=0 as it breaks default compiler setup for Ubuntu 16.04
* Revert "[OpenCL] kernel_estimator_test.py assertEqual-> assertAlmostEqual due to floating point representation on the device"
This reverts commit 06c50c0a485f40c30a436f02c3fa7794e370c49d.
* [OpenCL] CPU allocator is a singleton; we should not delete it
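The device-selection behaviour described in the GetSYCLDevice() entry above (supported devices only, GPU first, blacklisted devices excluded) can be illustrated with a minimal Python sketch. The names `Device`, `select_devices`, the priority table and the blacklist contents are hypothetical and exist only for illustration; the actual C++ implementation may well differ.

```python
from dataclasses import dataclass

# Hypothetical priority table: a lower number means a higher priority.
# Only "GPU first" comes from the changelog; the other entries are assumed.
PRIORITY = {"gpu": 0, "accelerator": 1, "cpu": 2}


@dataclass(frozen=True)
class Device:
    name: str
    kind: str  # "gpu", "accelerator" or "cpu"


def select_devices(devices, blacklist=frozenset()):
    """Return supported devices, highest priority first, skipping blacklisted names."""
    allowed = [d for d in devices if d.name not in blacklist and d.kind in PRIORITY]
    # sorted() is stable, so devices of the same kind keep their input order.
    return sorted(allowed, key=lambda d: PRIORITY[d.kind])


if __name__ == "__main__":
    found = [Device("host-cpu", "cpu"), Device("bad-gpu", "gpu"), Device("good-gpu", "gpu")]
    for dev in select_devices(found, blacklist={"bad-gpu"}):
        print(dev.name)
```

Keeping the blacklist as a plain set of names is an assumption made for brevity; a real runtime would more likely match on vendor or driver properties.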
|
| |
* OpenCL improvements
Added Tile, Transpose and Range Ops double support for SYCL device.
Moved gpu_device_name() to test_util.py so now it can be used in force_gpu to pull either GPU or SYCL depending on what is available in the system.
* Improvements to the SYCL device support
- Registration of Type Traits required for strided slice op
- Registration of ConcatOffset, _ListToArray, _ArrayToList
Pad, Reverse ( CPU ), ReverseV2 ( CPU ), Size, ExpandDims,
Squeeze, StridedSlice, StridedSliceGrad, StridedSliceAssign,
TileGrad, InvertPermutation, Transpose
- Registration of Sycl kernels only for essential data types
- Floor_div_real has been disabled for SYCL device
- Device in control_flow_ops_py_test.py needed to be lower cased
* SYCL support improvements (#31)
* Improvements to the SYCL device support
This commit reduces the number of failing tests when TensorFlow is compiled
with OpenCL support.
- Registration of Type Traits required for strided slice op
- Registration of ConcatOffset, _ListToArray, _ArrayToList
Pad, Reverse ( CPU ), ReverseV2 ( CPU ), Size, ExpandDims,
Squeeze, StridedSlice, StridedSliceGrad, StridedSliceAssign,
TileGrad, InvertPermutation, Transpose
- Registration of Sycl kernels only for essential data types
- Floor_div_real has been disabled for SYCL device
- Device in control_flow_ops_py_test.py needed to be lower cased
* Fixes & Version bump (#33)
* Fix Unbuntu typo. (#38)
unbuntu -> ubuntu
* Add problem descriptions and solutions (#35)
* Add ComputeCpp lib folder to LD_LIBRARY_PATH
* Add ImportError problem + solution
If you get the error message "ImportError: libComputeCpp.so: cannot open shared
object file: No such file or directory", make sure you have added the
path to ComputeCpp's lib folder to your `LD_LIBRARY_PATH`.
* Add another ImportError problem + solution
If you get the error message "ImportError: cannot import name
'pywrap_tensorflow'" you may be standing in the TensorFlow directory.
* Improvements to the SYCL device support
* Registers FloorDiv, FloorMod and SoftMax Ops for SYCL device
* Workaround for 0 bytes allocation for SYCL device (#42)
* Sycl improvements (#44)
- Eigen version bump
- Extends Cast and Cwise ops benchmark to cover Sycl device
- Extends device_lib_test.py to cover Sycl device
- Registers int32, string and ResourceHandler to run on host for
Enter and RefEnter Sycl Ops
- Enables ReduceMax op for Sycl since the Eigen implementation is ready
- Registers Less op for Sycl device
* Improved the formatting of the SYCL code
* Fixed compilation error.
* Made sure that using test sessions with force_gpu=True forces the
placement on a gpu device even if none is detected.
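The two ImportError entries above boil down to checking that ComputeCpp's lib folder is on `LD_LIBRARY_PATH` and that the library file is really there. A minimal sketch of such a check follows; the helper names `lib_dir_on_path` and `diagnose` are hypothetical, and the directory used in the example is an assumed install location, not one given in the changelog.

```python
import os


def lib_dir_on_path(lib_dir, env=None):
    """Return True if lib_dir is listed in LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    entries = [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]
    target = os.path.normpath(lib_dir)
    return any(os.path.normpath(p) == target for p in entries)


def diagnose(lib_dir, env=None, lib_name="libComputeCpp.so"):
    """Give a short reason why the ComputeCpp library might not be found."""
    if not os.path.isfile(os.path.join(lib_dir, lib_name)):
        return "missing library"
    if not lib_dir_on_path(lib_dir, env):
        return "not on LD_LIBRARY_PATH"
    return "ok"


if __name__ == "__main__":
    # Assumed install location, for illustration only.
    print(diagnose("/opt/ComputeCpp/lib"))
```

For the second error ('pywrap_tensorflow'), the changelog's own advice applies: run Python from a directory other than the TensorFlow source tree.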
|
| |
* Registered Pack, Shape, Split, Unpack, Aggregate, ControlFlow, Session, Slice and Placeholder Ops.
* Avoid dividing by zero in python test.
* Passing -cl-denorms-are-zero to ComputeCpp as denormals need to be flushed to zero.
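To show what flushing denormals to zero means for single-precision values, here is a small host-side Python sketch. It only emulates the effect on one float32 value; it does not call ComputeCpp or OpenCL, and the helper names `to_float32` and `flush_denormal` are hypothetical.

```python
import struct

# Smallest positive normal float32 value (2**-126).
FLT_MIN = 2.0 ** -126


def to_float32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def flush_denormal(x):
    """Return x as float32, replacing denormal values with zero of the same sign."""
    y = to_float32(x)
    if y != 0.0 and abs(y) < FLT_MIN:
        return 0.0 if y > 0 else -0.0
    return y


if __name__ == "__main__":
    tiny = 1e-40  # below FLT_MIN, so it is a denormal in float32
    print(to_float32(tiny), flush_denormal(tiny))
```

A test that compares host and device results could use a helper like this on the host side so that both sides treat denormal inputs the same way, assuming the device really does flush them.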
|