aboutsummaryrefslogtreecommitdiffhomepage
path: root/third_party/sycl/crosstool/computecpp.tpl
Commit message (Collapse)AuthorAge
* Add -Wno-c++11-narrowing to ComputeCpp device compiler flags to avoid build ↵Gravatar Luke Iwanski2017-07-03
| | | | errors on 32-bit targets. (#109) (#11244)
* [Build] Reduces build time for SYCL target (#10996)Gravatar Luke Iwanski2017-06-23
| | | | | | | | | | | | | | | | | | * [Build] Removes tensorboard from skip list since it has been removed * Improve SYCL config for Tensorflow (#98) Only libtensorflow attempts to link against libComputeCpp.so now, other targets should only utilise the headers. Added more if_sycl statements to BUILD files. * [Build] Fixes typos & linkstatic=1 -> linkstatic=0 * [Build] Cleans SYCL dependency, linkstatic = 0 * [Build] No need to link syclrt tensorflow/BUILD * sycl_runtime is needed in kernels/BUILD due to inclusion
* [OpenCL] Implementation improvements (#9117)Gravatar Luke Iwanski2017-05-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * OpenCL Improvements * Registers Scatter and ScatterNd Ops for SYCL * Registers Stack op for SYCL * Fixes No sycl buffer found error for debug ops * Registers MatMul and Transpose Ops to SYCL device for double * Extends analyzer_cli_test.py test to cover SYCL * Fixes Transpose Op for double when on SYCL * Bumps Eigen version to fix double precision issue on SYCL * Extends SessionDebugTestBase to cover SYCL * Register SYCL implementations for random ops * Avoid functions that might not be defined on SYCL device (#51) * Avoid functions that might not be defined on SYCL device * Simplify by using Eigen math functions * OpenCL improvements - Bumps Eigen Version - Refactors Ops registration - Introduces workaround for Const Op related to the difference between CUDA which uses pointers and OpenCL that uses buffers/accessors - Extends memory types to cover DEVICE_SYCL as well - Introduces GetSYCLDevice() method that returns list of supported devices with GPU device having the highest priority ( doesn't include blacklisted devices ) - ::internal::Transpose -> tensorflow::internal::Transpose in order to avoid compilation reported error - re-introduces fix for bugged string replacement causing a lot of compilation warnings -c -> --include - Adds sycl_runtime to bazels ARRAY_DEPS - Replicates TF_CALL_GPU_PROXY_TYPES for SYCL * [OpenCL] Fixes an issue caused by switch to aligned allocator for sycl buffer (#53) * [Build] Use gcc/g++ as a host compiler to avoid https://github.com/tensorflow/tensorflow/issues/8394 (#54) * [OpenCL] Fixes Scatter Op * Fix testSimple and testConst in stack_op_test (#3) * Fix testSimple and testConst in stack_op_test * Create a specialisation of DoParallelConcatUpdate for SyclDevice and register it * Guard all code in TENSORFLOW_USE_SYCL * Do not use sycl device for int32 * Registration of the Sycl version is now looking like the one for the GPU * Remove added empty line * Register batch normalization kernels for OpenCL (#61) * [OpenCL] RandomGamma has no GPU friendly implementation (#57) * [OpenCL] Compatibility fixes for TensorFlow 1.1.0-rc1 * [OpenCL] Implements BatchMatmul Op for SYCL * Lowercase the device name when GPU or SYCL returned * [OpenCL] kernel_estimator_test.py assertEqual-> assertAlmostEqual due to floating point representation on the device * [Eigen] Version bump * GPU device name string manipulation is not needed anymore * [OpenCL] Adds SYCL to device backwards compatibility * [OpenCL] Extends core_rnn_test.py to run for SYCL device * [OpenCL] Minor optimizations for build script * [OpenCL] Enables skip folder list in build script * [OpenCL] Fixes ApplyAdamOp for Sycl device * [OpenCL] SYCL device improvements * [OpenCL] Fixes debug_ops's SEGFAULT for SYCL device * [Build] Adds hexagon to skipped folders list * [OpenCL] Removes EnterLameDuckMode from SYCL device and allocator * [OpenCL] Registers Unique Op for SYCL device * [OpenCL][Temporary] Disables tests for SYCL target due to features not being implemented yet Tests affected: - tensorflow/contrib/memory_stats/python/kernel_tests/memory_stats_ops_test.py - tensorflow/contrib/rnn/python/kernel_tests/core_rnn_test.py - tensorflow/python/kernel_tests/conv_ops_test.py - tensorflow/python/kernel_tests/depthwise_conv_op_test.py - tensorflow/python/kernel_tests/pooling_ops_3d_test.py - tensorflow/python/kernel_tests/pooling_ops_test.py - tensorflow/python/kernel_tests/scatter_nd_ops_test.py - tensorflow/python/training/adam_test.py - tensorflow/python/training/localhost_cluster_performance_test.py - tensorflow/python/training/training_ops_test.py * [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline Tests affected: - tensorflow/python/debug/cli/analyzer_cli_test.py - tensorflow/python/debug/lib/session_debug_testlib.py - tensorflow/python/debug/lib/stepper_test.py - tensorflow/python/kernel_tests/unstack_op_test.py - tensorflow/python/ops/image_ops_test.py * [OpenCL] Take options.config.device_count() into consideration * [OpenCL] Fixes compilation warning * [OpenCL] device:SYCL:0 -> sycl:0 * [OpenCL] Removes unwanted flags in building script Removes flags given to computecpp that enable SIMD instructions Removes duplicate flags * bool -> const bool * [OpenCL] sycl in test_util.gpu_device_name() -> is_sycl_enabled() * [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline Test affected: - tensorflow/contrib/stateless/python/kernel_tests/stateless_random_ops_test.py * Imports test_util from tensorflow.python.framework * [OpenCL] Fixes formatting in Python code * [OpenCL] Extends session_test.py to cover SYCL device * [OpenCL] Cleans singleton class * [OpenCL] Keeping CUDA happy * [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline Test affected: - tensorflow/contrib/rnn/python/kernel_tests/core_rnn_cell_test.py - tensorflow/contrib/seq2seq/python/kernel_tests/beam_search_ops_test.py * Added support for building with SYCL on ARM. * Acts on the review feedback from: - https://github.com/tensorflow/tensorflow/pull/9117#discussion_r113608975 - https://github.com/tensorflow/tensorflow/pull/9117#discussion_r113609173 * [OpenCL] Fixes scatter_nd_op_test * Fixes auto-merge mistake * [OpenCL] struct SyclDevice -> class SyclDevice * Revert "[OpenCL] struct SyclDevice -> class SyclDevice" This reverts commit addd43348c374a5379f67bb1e5ad084715722fc2. * [OpenCL] Reverting refactoring commit. As requested in the review https://github.com/tensorflow/tensorflow/pull/9117#issuecomment-298454466 This change set will be re-introduced in smaller chunks. * Revert "[OpenCL] device:SYCL:0 -> sycl:0" This reverts commit cf16e60340b62d16c3764d71b716fe03d35f87a9. * Revert "[OpenCL] Adds SYCL to device backwards compatibility" This reverts commit b8401b5164199b7a169be1c1d8dea5001195c390. * Acts on the feedback from https://github.com/tensorflow/tensorflow/pull/9117#discussion_r115036905 * control_flow_ops_py_test.py expects device name to be lower cased * Acts on the feedback from https://github.com/tensorflow/tensorflow/pull/9117#discussion_r115037222 * Removes debug print * Removes not needed partial specialisation * [OpenCL] Registers ScatterNdFunctor for SYCL device * [OpenCL] Make it compile * [OpenCL] Follow gpu_device changes * [OpenCL] Adds cxx_builtin_include_directory for python lib Fixes bazels missing undeclared inclusions that appeared after merge with TensorFlow upstream * [OpenCL] Fixes Constant Op * [OpenCL] gXX-4.8 -> gXX * [OpenCL] Removes -D_GLIBCXX_USE_CXX11_ABI=0 as it breaks default compiler setup for Ubuntu 16.04 * Revert "[OpenCL] kernel_estimator_test.py assertEqual-> assertAlmostEqual due to floating point representation on the device" This reverts commit 06c50c0a485f40c30a436f02c3fa7794e370c49d. * [OpenCL] CPU allocator is a singleton we should not delete it
* Update computecpp.tpl (#8578)Gravatar Patrick2017-03-21
| | | Computecpp can't generate code for AVX function __builtin_ia32_cmpps256.
* OpenCL Improvements (#7596)Gravatar Benoit Steiner2017-02-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * OpenCL improvements Added Tile, Transpose and Range Ops double support for SYCL device. Moved gpu_device_name() to test_util.py so now it can be used in force_gpu to pull either GPU or SYCL depending on what is available in the system. * Improvements to the SYCL device support - Registration of Type Traits required for stride slice op - Registration of ConcatOffset, _ListToArray, _ArrayToList Pad, Reverse ( CPU ), ReverseV2 ( CPU ), Size, ExpandDims, Squeeze, StridedSlice, StridedSliceGrad, StridedSliceAssign, TileGrad, InvertPermutation, Transpose - Registration of Sycl kernels only for essential data types - Floor_div_real has been disabled for SYCL device - Device in control_flow_ops_py_test.py needed to be lower cased * SYCL support improvements (#31) * Improvements to the SYCL device support This commit reduces number of failing tests when TensorFlow compiles for OpenCL support. - Registration of Type Traits required for stride slice op - Registration of ConcatOffset, _ListToArray, _ArrayToList Pad, Reverse ( CPU ), ReverseV2 ( CPU ), Size, ExpandDims, Squeeze, StridedSlice, StridedSliceGrad, StridedSliceAssign, TileGrad, InvertPermutation, Transpose - Registration of Sycl kernels only for essential data types - Floor_div_real has been disabled for SYCL device - Device in control_flow_ops_py_test.py needed to be lower cased * Fixes & Version bump (#33) * Fix Unbuntu typo. (#38) unbuntu -> ubuntu * Add problem descriptions and solutions (#35) * Add ComputeCpp lib folder to LD_LIBRARY_PATH * Add ImportError problem + solution If you get the error message "ImportError: libComputeCpp.so: cannot open shared object file: No such file or directory", make sure you have added the path to ComputeCpp's lib folder to your `LD_LIBRARY_PATH`. * Add another ImportError problem + solution If you get the error message "ImportError: cannot import name 'pywrap_tensorflow'" you may be standing in the TensorFlow directory. * Improvements to the SYCL device support * Registers FloorDiv, FloorMod and SoftMax Ops for SYCL device * Workaround for 0 bytes allocation for SYCL device (#42) * Sycl improvements (#44) - Eigen version bump - Extends Cast and Cwise ops benchmark to cover Sycl device - Extends device_lib_test.py to cover Sycl device - Registers int32, string and ResourceHandler to run on host for Enter and RefEnter Sycl Ops - Enables RecudeMax op for Sycl since Eigen implementation is ready - Registers Less op for Sycl device * Improved the formatting of the SYCL code * Fixed compilation error. * Made sure that using test sessions with force_gpu=True forces the placement on a gpu device even if none is detected.
* Python : simplify computecpp.tplGravatar Joan Thibault2017-02-10
|
* Added missing BUILD dummy file to third_party/sycl/crosstool/BUILD (#22)Gravatar Luke Iwanski2016-12-19
| | | | | | | | * Added missing BUILD dummy file to third_party/sycl/crosstool/BUILD * Use floor_div_real for SYCL device. * Cleaned up SYCL crosstool.
* Sycl improvement. Gravatar Luke Iwanski2016-12-16
| | | | | | * Registered Pack, Shape, Split, Unpack. Aggregate, ControlFlow, Session, Slice and Placeholder Ops. * Avoid dividing by zero in python test. * Passing -cl-denorms-are-zero to ComputeCpp as denormals need to be flushed to zero.
* Added '-Wl,-no-as-needed' for linking *.so's, when CROSSTOOL is used.Gravatar Luke Iwanski2016-12-15
|
* SYCL improvementsGravatar Luke Iwanski2016-12-04
|
* Added sycl_asan target to the bazelGravatar Luke Iwanski2016-11-23
|
* Added OpenCL support for more operationsGravatar Luke Iwanski2016-11-21
|
* Registered More Ops for SYCL deviceGravatar Luke Iwanski2016-11-15
|
* Added -sycl-compress-name flag for ComputeCpp to avoid very long kernel names.Gravatar luke2016-11-15
|
* Update to SYCL bazel toolchain.Gravatar luke iwanski2016-10-17