path: root/tensorflow/core/kernels/mkl_pooling_ops_common.cc
Commit message (Author, Age)
* mkl_pooling_ops_common.cc: convert asserts to DCHECKs. (A. Unique TensorFlower, 2018-09-07)
|   DCHECK is more idiomatic in the TensorFlow code base. Also, some of the "not-reached" asserts were actually inverted, asserting an always-true rather than an always-false expression.
|   PiperOrigin-RevId: 211986533
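The inverted "not-reached" assert described above can be sketched as follows. This is a hypothetical illustration, not code from this file: `DCHECK` here is a minimal stand-in for TensorFlow's macro from tensorflow/core/platform/logging.h, and `OutputsForPoolingAlgorithm` is an invented example function.

```cpp
#include <cassert>

// Minimal stand-in for TensorFlow's DCHECK (the real macro, from
// tensorflow/core/platform/logging.h, logs and aborts in debug builds
// and compiles to nothing in optimized builds).
#ifndef NDEBUG
#define DCHECK(condition) assert(condition)
#else
#define DCHECK(condition) ((void)0)
#endif

// Hypothetical example function. The "before" form of the not-reached
// branch was inverted and never fired:
//     default: assert(true);  // always passes, even when reached
// The converted form actually fails when the branch is reached:
int OutputsForPoolingAlgorithm(int algorithm) {
  switch (algorithm) {
    case 0:  // max pooling
      return 1;
    case 1:  // average pooling
      return 1;
    default:
      DCHECK(false);  // not reached for supported algorithms
      return 0;
  }
}
```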
* variable renaming per code review suggestions (Guozhong Zhuang, 2018-08-17)
|
* enable pooling3D op (Guozhong Zhuang, 2018-08-13)
|
* Rename MKL-related feature macros. (A. Unique TensorFlower, 2018-08-10)
|   The existing feature macros are named INTEL_MKL to indicate that any flavor of MKL is available, INTEL_MKL_ML to indicate that *only* MKL-ML is available (i.e. MKL-DNN is not), and DO_NOT_USE_ML to indicate that *only* MKL-DNN is available (i.e. MKL-ML is not).
|   This change renames INTEL_MKL_ML to INTEL_MKL_ML_ONLY and DO_NOT_USE_ML to INTEL_MKL_DNN_ONLY. The meanings of the macros have not changed.
|   This change also adds a few sanity checks to mkl_util.h that ensure that the combination of INTEL_MKL, INTEL_MKL_ML_ONLY, and INTEL_MKL_DNN_ONLY is logically consistent: the *_ONLY macros may not both be defined, and if either of them is defined, bare INTEL_MKL must also be defined.
|   PiperOrigin-RevId: 208313735
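The sanity checks described in this commit message could look roughly like the preprocessor guards below. This is a sketch based on the message, not the exact code from tensorflow/core/util/mkl_util.h; the error-message wording and the helper function are assumptions.

```cpp
// Sketch of the macro-consistency checks described above. Compilation
// fails with a clear message if the macro combination is inconsistent.
#if defined(INTEL_MKL_ML_ONLY) && defined(INTEL_MKL_DNN_ONLY)
#error "INTEL_MKL_ML_ONLY and INTEL_MKL_DNN_ONLY cannot both be defined."
#endif
#if (defined(INTEL_MKL_ML_ONLY) || defined(INTEL_MKL_DNN_ONLY)) && \
    !defined(INTEL_MKL)
#error "INTEL_MKL_ML_ONLY or INTEL_MKL_DNN_ONLY requires INTEL_MKL."
#endif

// Hypothetical helper: if this translation unit compiles at all, the
// macro combination passed the checks above.
constexpr bool MklMacroCombinationOk() { return true; }
```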
* Merge pull request #19403 from Intel-tensorflow:primreuse_pooling (TensorFlower Gardener, 2018-08-06)
|\
|   PiperOrigin-RevId: 207625392
| * code change based on PR review suggestions and style check (Guozhong Zhuang, 2018-07-23)
| |
| * Fix a typo during code refactoring (Li, Yiqiang, 2018-07-11)
| |
| * code refactoring per Rasmus's suggestions on PR 19754 (Guozhong Zhuang, 2018-06-12)
| |
| * enhancement with pooling ops primitive reuse (Guozhong Zhuang, 2018-05-18)
|/
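The primitive-reuse enhancement merged above follows a common caching pattern: construct an MKL-DNN pooling primitive once per unique shape/parameter key, and return the cached instance on later calls so repeated ops with identical parameters skip re-creation. The sketch below is illustrative only; `PoolingPrimitive` and `PoolingPrimitiveFactory` are hypothetical names, not TensorFlow's actual classes, and a real primitive would wrap `mkldnn::pooling_forward` plus its memory descriptors.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a cached MKL-DNN pooling primitive; we keep
// only the key for illustration.
struct PoolingPrimitive {
  explicit PoolingPrimitive(std::string key) : key_(std::move(key)) {}
  std::string key_;
};

class PoolingPrimitiveFactory {
 public:
  // Returns the cached primitive for `key`, creating it on first use.
  // The key would typically encode algorithm, window, strides, padding.
  static PoolingPrimitive* Get(const std::string& key) {
    static std::unordered_map<std::string, std::unique_ptr<PoolingPrimitive>>
        cache;
    auto it = cache.find(key);
    if (it == cache.end()) {
      it = cache.emplace(key, std::make_unique<PoolingPrimitive>(key)).first;
    }
    return it->second.get();
  }
};
```

Calling `Get` twice with the same key returns the same pointer, which is the whole point of the reuse: primitive construction is the expensive step, lookup is cheap.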
* Making MKL-DNN default build choice (#16474) (AG Ramesh, 2018-01-26)
|
* Branch 183429339 (#16469) (Rasmus Munk Larsen, 2018-01-26)
|   * Change `reduce_logsumexp` to internally use `reshape` rather than `squeeze` since the latter requires the `axis` arg to be a Python `list`. PiperOrigin-RevId: 183396533
|   * Kernel utils to support broadcast add and mul. PiperOrigin-RevId: 183397494
|   * Updating sparsify_gather. PiperOrigin-RevId: 183402917
|   * [tf.data] Move slow-path-related code into the slow path in IteratorHandleOp::Compute(). This slightly reduces the amount of work performed when an iterator is accessed (after the first access), and potentially reduces contention if concurrent steps are accessing the same iterator. PiperOrigin-RevId: 183406221
|   * Cleanup: Ran clang-format on all *.{cc,h} under grappler. PiperOrigin-RevId: 183406440
|   * Increase shard count of //third_party/tensorflow/python:nn_batchnorm_test to avoid timeouts. When run under asan, the test runs for about 5 minutes, and sometimes longer, causing frequent timeouts. This change increases the shard count of the test to 4, which brings the run time of the longest-running shard under asan to about 2 minutes. PiperOrigin-RevId: 183414888
|   * Add available choices to toco flags and fix minor formatting issues. PiperOrigin-RevId: 183415713
|   * Performance improvements to some GPU code to use shared locks instead of unique locks for some hotspot cases. PiperOrigin-RevId: 183418559
|   * [XLA] Improve error message for bad slices. PiperOrigin-RevId: 183420038
|   * Fix py3 build rules for all py tests under py2tf. PiperOrigin-RevId: 183422144
|   * Fix bug with Operation._control_inputs setter. PiperOrigin-RevId: 183422192
|   * Make softmax_op_test.py work with C API enabled. PiperOrigin-RevId: 183422829
|   * Cleanup: Ran clang-format on all *.{cc,h} files in tensorflow/core/kernels. PiperOrigin-RevId: 183423961
|   * Fix the documentation for the dense layer for how rank > 2 inputs are handled. PiperOrigin-RevId: 183425868
|   * Cleanup: Ran clang-format on all *.{cc,h} in tensorflow/core/ops. PiperOrigin-RevId: 183429339
* MKL: Adding MKL-DNN pooling ops (#14679) (Mahmoud Abuzaina, 2017-12-06)
|   * Adding MKL-DNN pooling ops
|   * Disabling LRN MKL path; forcing to Eigen
|   * Using the real undef preprocessor command.
* Adding MKL op for reshape and several fixes to other ops (#9228) (Vivek Rane, 2017-04-17)
|   * Bug fixes to max/avg pooling
|   * Added fixes for unexpected behaviour of OP_REQUIRES
|   * Fix to get common_runtime/function_test to pass with MKL. Removes eigen label from test function when compiled with mkl flag
|   * MKL Reshape
|   * MKL Reshape (by Amin)
|   * Fixed duplication caused during merge
|   * Modified Mkl Layer and Op Registry
|   * config for mkl
|   * Addressed PR comments
|   * Mkl ops ordered alphabetically in tensorflow/core/BUILD
* Branch 152232810 (#8988) (Rohan Jain, 2017-04-05)
|   * Improve py_func error handling. Automatically translate some python errors into corresponding TF errors at runtime. Change: 152156821
|   * Update interaction with libpng so that we use the public API instead of knowledge of the internal libpng data structures. Change: 152167754
|   * TensorBoard plugins now contain their own name/route prefix. Change: 152167807
|   * Passes trainable flag to separable_conv2d biases. Change: 152170239
|   * Saving resource variables with a caching device. Change: 152171539
|   * Drop loss from estimator_spec.eval_metric_ops, as required by core Estimator. Change: 152179924
|   * sample_stats.percentile DOCFIX. Change: 152182295
|   * Added a memory optimizer to grappler. Change: 152184170
|   * Change default behavior of the tf runs selector: if there are fewer than 41 runs, enable them all by default; if there are 41 runs or more, disable them all by default. This is in response to user complaints that having it enable only the first ten runs by default was confusing, because it was not obvious to users that some runs had been disabled. However, it still solves the initial user complaint that having very many runs simultaneously enabled would lag the UI. Also changed the "toggle all runs" button to try to turn everything off before turning everything on, and improved the logic for detecting when the runs selection is back in the default state, so that long URI strings are avoided wherever possible. Change: 152188948
|   * Autogenerated Change: Change TensorBoard TAG to 52. Change: 152189000
|   * Remove warning that was only happening with config cuda. Change: 152189205
|   * Make resource variable shared name consistent with non-resource variables. Remove colocation constraint from resource variable cached value with the variable itself. Change: 152192203
|   * Add a way to specify the optimization order; refactor and add constant folding to meta optimizer. Change: 152193646
|   * Backport fixes and improvements from external Keras. Change: 152198296
|   * Merge changes from github. Change: 152200430
|   * Go: Update generated wrapper functions for TensorFlow ops. Change: 152200754
|   * Update ops-related pbtxt files. Change: 152203174
|   * Make ImportGraphDef() work with functions. In addition to modifying graph_constructor.cc, this patch adds other functionality to enable importing functions: the ability to add FunctionDefLibraries to Graphs and FunctionLibraryDefinitions (in addition to existing functions), and a FunctionDefsEqual() utility function. Change: 152205258
|   * Expand contrib test to more than just test targets. Change: 152206822
|   * Preserve graph version during optimization. Change: 152213262
|   * Exclude enter and exit nodes from shape refiner's constant folding. Change: 152213637
|   * Allow reshape_mover and algebraic_simplifier to make multiple mutations, by avoiding the short-circuit std::any_of. Change: 152232810
|   * fixing workspace.bzl
|   * workspace.bzl further fixes
|   * fixing tensorflow.bzl merge conflicts
|   * fixing typo in dnn.h
|   * fixing bad merge for dnn.h
* MKL support for max/avg pooling and relu (#8296) (Vivek Rane, 2017-03-23)
|   * Adding MKL support for Max/Avg Pooling and ReLU
|   * Missed the mkl layer registry files
|   * Fixed sanity check errors with buildifier
|   * Adding Intel Conv2D kernel implementation along with required Graph passes. This commit contains 4 main components: 1) Intel-optimized kernel implementation for the Conv2D op, in kernels/mkl_conv_ops.*; 2) Graph passes required to enable the optimized Conv2D implementation, in graph/mkl_*, plus a new op, MklToTf, implemented in kernels/mkl_tfconv_op.cc; 3) Utility functions used in the kernel implementation, in common_runtime/mkl_layer_registry* and util/mkl_util.h; 4) BUILD changes for Conv2D, the graph passes, and the utility functions
|   * Refactor MKL convolution forward pass computation into smaller functions. Changed configure to point to newer MKLML library
|   * Moved Mkl helper datastructures and routines to private class members
|   * MKL op registration changed to use existing op registry (nhasabni)
|   * Fixed buildifier error
|   * Removed the mkl layer registry (should not have been added) and made fixes according to the code review comments
|   * Fixed rebase messups
|   * Added documentation for mkl pooling op parameters
|   * removed layer registry reference from mkl relu op