author | Jonathan Hseu <jhseu@google.com> | 2017-08-25 14:01:05 -0700 |
---|---|---|
committer | TensorFlower Gardener <gardener@tensorflow.org> | 2017-08-25 14:04:48 -0700 |
commit | 008910f1122d115a6d7430bfcc63cf4296c7467d (patch) | |
tree | e50199dcceed004cecc8510f9251f5e04734800f /tensorflow/contrib/reduce_slice_ops | |
parent | 005a88f6cc6e4e8c94a4f2d1980737855c4592f4 (diff) |
Merge changes from github.
END_PUBLIC
---
Commit b30ce4714 authored by James Qin<jamesqin@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Revamp CudnnRNN Saveables
1. Use a lossy way to save/restore cudnn biases during checkpointing.
Cudnn uses 2 biases per gate for all RNNs while tf uses one. To keep cudnn checkpoints
compatible with both Cudnn and platform-independent impls, previously both the
individual biases and the summed bias for each gate were stored.
The new way stores only the bias sum for each gate and splits it half-half when
restoring from a cudnn graph. This does not cause problems since RNNs do not use
weight decay to regularize.
2. Use inheritance instead of branching
* Split RNNParamsSaveable into 1 base class and 4 subclasses.
* Extract common routines and only override rnn-type-specific pieces in subclasses.
PiperOrigin-RevId: 166413989
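
The lossy bias handling described in item 1 can be sketched as follows. This is a minimal NumPy illustration, not the actual CudnnRNN saveable code: the function names `bias_to_checkpoint` and `bias_from_checkpoint` and the example values are hypothetical. It assumes the two Cudnn biases of a gate only ever enter the computation as their sum, which is why an even split restores equivalent behavior.

```python
import numpy as np


def bias_to_checkpoint(b_input, b_recurrent):
    """Collapse the two per-gate Cudnn biases into the single stored bias."""
    return np.asarray(b_input) + np.asarray(b_recurrent)


def bias_from_checkpoint(b_sum):
    """Split a stored bias sum evenly into two Cudnn-style biases (lossy)."""
    half = np.asarray(b_sum) / 2.0
    return half, half.copy()


# Hypothetical biases for one gate with 3 units.
b_w = np.array([0.5, -1.0, 2.0])
b_r = np.array([1.5, 0.0, -2.0])

stored = bias_to_checkpoint(b_w, b_r)
restored_w, restored_r = bias_from_checkpoint(stored)

# The sum that the gate actually uses survives the round trip...
sum_preserved = np.allclose(restored_w + restored_r, b_w + b_r)
# ...but the original individual biases generally do not.
individual_preserved = np.allclose(restored_w, b_w)
```

The split would only matter if a penalty such as weight decay were applied to each bias separately, which matches the commit message's remark about regularization.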
---
Commit ebc421daf authored by Alan Yee<alyee@ucsd.edu>
Committed by Jonathan Hseu<vomjom@vomjom.net>:
Update documentation for contrib (#12424)
* Update __init__.py
Remove ## for standardization of api docs
* Create README.md
Add README to define this directory's purpose
* Update __init__.py
Markdown styling does not show up well in api docs
* Update README.md
Add short mention of describing what to deprecate
* Update README.md
Capitalize title
* Update README.md
Revert README change
* Delete README.md
---
Commit fd295394d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Use latest version of nsync library, which now allows use of cmake on MacOS.
PiperOrigin-RevId: 166411437
---
Commit 587d728e0 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Refactor reduce-precision-insertion filters, add several more options.
In particular, this adds the ability to add reduce-precision operations after fusion nodes based on the contents of those fusion nodes, and the ability to filter operations based on the "op_name" metadata.
PiperOrigin-RevId: 166408392
---
Commit 3142f8ef5 authored by Ali Yahya<alive@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Steps toward making ResourceVariables compatible with Eager.
This change forces the value of the reuse flag in variable scopes to be tf.AUTO_REUSE when in Eager mode.
This change also adds comprehensive Eager tests for ResourceVariable.
PiperOrigin-RevId: 166408161
---
Commit b2ce45150 authored by Igor Ganichev<iga@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make Graph::IsValidNode public
It can be reimplemented with existing public APIs, but instead of doing so,
making this one public seems better.
PiperOrigin-RevId: 166407897
---
Commit 0a2f40e92 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA::CPU] Fix HLO profiling in parallel CPU backend.
PiperOrigin-RevId: 166400211
---
Commit c4a58e3fd authored by Yao Zhang<yaozhang@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Identify frame ids for all nodes in a graph.
PiperOrigin-RevId: 166397615
---
Commit 989713f26 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
BEGIN_PUBLIC
Automated g4 rollback of changelist 166294015
PiperOrigin-RevId: 166521502
Diffstat (limited to 'tensorflow/contrib/reduce_slice_ops')
4 files changed, 36 insertions, 42 deletions
diff --git a/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.cc b/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.cc
index 2def4f3f17..c33804906f 100644
--- a/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.cc
+++ b/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.cc
@@ -15,8 +15,8 @@ limitations under the License.
 #define EIGEN_USE_THREADS
-#include "tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h"
 #include <algorithm>
+#include "tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h"
 #include "tensorflow/core/framework/op.h"
 #include "tensorflow/core/framework/op_kernel.h"
 #include "tensorflow/core/framework/register_types.h"
diff --git a/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h b/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h
index c62a7b20d6..fc3a2da9b3 100644
--- a/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h
+++ b/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h
@@ -16,10 +16,10 @@ limitations under the License.
 #ifndef THIRD_PARTY_TENSORFLOW_CORE_KERNELS_PARTIAL_REDUCTION_OPS_H_
 #define THIRD_PARTY_TENSORFLOW_CORE_KERNELS_PARTIAL_REDUCTION_OPS_H_
-#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
 #include "tensorflow/core/framework/tensor.h"
 #include "tensorflow/core/framework/tensor_shape.h"
 #include "tensorflow/core/framework/tensor_types.h"
+#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
 #define Sum(a, b) ((a) + (b))
 #define Prod(a, b) ((a) * (b))
@@ -58,11 +58,11 @@ inline T negative_infinity() {
 } // namespace reduce_functions
-#define CALL_ALL_REDUCEOPS(func, ...) \
-  func(Sum, functor::reduce_functions::zero, ##__VA_ARGS__) \
-  func(Prod, functor::reduce_functions::one, ##__VA_ARGS__) func( \
-      Max, functor::reduce_functions::negative_infinity, ##__VA_ARGS__) \
-      func(Min, functor::reduce_functions::infinity, ##__VA_ARGS__)
+#define CALL_ALL_REDUCEOPS(func, ...) \
+  func(Sum, functor::reduce_functions::zero, ##__VA_ARGS__) \
+  func(Prod, functor::reduce_functions::one, ##__VA_ARGS__) \
+  func(Max, functor::reduce_functions::negative_infinity, ##__VA_ARGS__) \
+  func(Min, functor::reduce_functions::infinity, ##__VA_ARGS__)
 #define ReduceSliceFunctorReduceop(reduceop, dummy) \
   template <typename Device, typename T, typename Index> \
diff --git a/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops_gpu.cu.cc b/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops_gpu.cu.cc
index 8b205f7dd5..8e6870fadd 100644
--- a/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops_gpu.cu.cc
+++ b/tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops_gpu.cu.cc
@@ -17,10 +17,10 @@ limitations under the License.
 #define EIGEN_USE_GPU
-#include "tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h"
 #include "tensorflow/core/framework/op.h"
 #include "tensorflow/core/framework/op_kernel.h"
 #include "tensorflow/core/framework/register_types.h"
+#include "tensorflow/contrib/reduce_slice_ops/kernels/reduce_slice_ops.h"
 #include "tensorflow/core/util/cuda_kernel_helper.h"
 namespace tensorflow {
@@ -68,9 +68,8 @@ namespace functor {
     if (sizex * sizey * sizez == 0) { \
       return; \
     } \
-    Cuda3DLaunchConfig config = GetCuda3DLaunchConfig( \
-        sizex, sizey, sizez, d, ReduceSliceDeviceKernel##reduceop<T, Index>, \
-        0, 0); \
+    Cuda3DLaunchConfig config = GetCuda3DLaunchConfig(sizex, sizey, sizez, d,\
+        ReduceSliceDeviceKernel##reduceop<T, Index>, 0, 0); \
     \
     ReduceSliceDeviceKernel##reduceop<T, Index> \
         <<<config.block_count, config.thread_per_block, 0, d.stream()>>>( \
diff --git a/tensorflow/contrib/reduce_slice_ops/python/kernel_tests/reduce_slice_ops_test.py b/tensorflow/contrib/reduce_slice_ops/python/kernel_tests/reduce_slice_ops_test.py
index 8c8db295ff..60a193db4c 100644
--- a/tensorflow/contrib/reduce_slice_ops/python/kernel_tests/reduce_slice_ops_test.py
+++ b/tensorflow/contrib/reduce_slice_ops/python/kernel_tests/reduce_slice_ops_test.py
@@ -39,48 +39,44 @@ class ReduceSliceTest(TensorFlowTestCase):
   def testReduceSliceSum2D(self):
     x = np.array([[1, 2, 3], [40, 50, 60], [700, 800, 900]], dtype=np.int32)
     indices = np.array([[0, 1], [0, 3], [1, 2], [1, 3], [0, 2]], dtype=np.int32)
-    result = np.array(
-        [[1, 2, 3], [741, 852, 963], [40, 50, 60], [740, 850, 960],
-         [41, 52, 63]],
-        dtype=np.int32)
+    result = np.array([[1, 2, 3], [741, 852, 963], [40, 50, 60],
+                       [740, 850, 960], [41, 52, 63]], dtype=np.int32)
     with self.test_session(use_gpu=True):
       y_tf = reduce_slice_ops.reduce_slice_sum(x, indices, 0).eval()
       self.assertAllEqual(y_tf, result)
   def testReduceSliceSum3D(self):
-    x = np.array(
-        [[[1, 2], [3, 4]], [[50, 60], [70, 80]], [[600, 700], [800, 900]]],
-        dtype=np.int32)
+    x = np.array([[[1, 2], [3, 4]], [[50, 60], [70, 80]],
+                  [[600, 700], [800, 900]]], dtype=np.int32)
     indices = np.array([[0, 1], [0, 3], [1, 2], [1, 3], [0, 2]], dtype=np.int32)
-    result = np.array(
-        [[[1, 2], [3, 4]], [[651, 762], [873, 984]], [[50, 60], [70, 80]],
-         [[650, 760], [870, 980]], [[51, 62], [73, 84]]],
-        dtype=np.int32)
+    result = np.array([[[1, 2], [3, 4]],
+                       [[651, 762], [873, 984]],
+                       [[50, 60], [70, 80]],
+                       [[650, 760], [870, 980]],
+                       [[51, 62], [73, 84]]], dtype=np.int32)
     with self.test_session(use_gpu=True):
       y_tf = reduce_slice_ops.reduce_slice_sum(x, indices, 0).eval()
       self.assertAllEqual(y_tf, result)
   def testReduceSliceSumAxis1(self):
-    x = np.transpose(
-        np.array([[1, 2, 3], [40, 50, 60], [700, 800, 900]], dtype=np.int32))
+    x = np.transpose(np.array([[1, 2, 3], [40, 50, 60],
+                               [700, 800, 900]], dtype=np.int32))
     indices = np.array([[0, 1], [0, 3], [1, 2], [1, 3], [0, 2]], dtype=np.int32)
-    result = np.transpose(
-        np.array(
-            [[1, 2, 3], [741, 852, 963], [40, 50, 60], [740, 850, 960],
-             [41, 52, 63]],
-            dtype=np.int32))
+    result = np.transpose(np.array([[1, 2, 3],
+                                    [741, 852, 963],
+                                    [40, 50, 60],
+                                    [740, 850, 960],
+                                    [41, 52, 63]], dtype=np.int32))
     with self.test_session(use_gpu=True):
       y_tf = reduce_slice_ops.reduce_slice_sum(x, indices, 1).eval()
       self.assertAllEqual(y_tf, result)
   def testReduceSliceSum1DIndices(self):
-    x = np.array(
-        [[1, 2, 3], [40, 50, 60], [700, 800, 900], [1000, 2000, 3000],
-         [40000, 50000, 60000]],
-        dtype=np.int32)
+    x = np.array([[1, 2, 3], [40, 50, 60], [700, 800, 900],
+                  [1000, 2000, 3000], [40000, 50000, 60000]], dtype=np.int32)
     indices = np.array([0, 0, 2, 5], dtype=np.int32)
-    result = np.array(
-        [[0, 0, 0], [41, 52, 63], [41700, 52800, 63900]], dtype=np.int32)
+    result = np.array([[0, 0, 0], [41, 52, 63],
+                       [41700, 52800, 63900]], dtype=np.int32)
     with self.test_session(use_gpu=True):
       y_tf = reduce_slice_ops.reduce_slice_sum(x, indices, 0).eval()
       self.assertAllEqual(y_tf, result)
@@ -88,9 +84,8 @@ class ReduceSliceTest(TensorFlowTestCase):
   def testReduceSliceProd(self):
     x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)
     indices = np.array([[0, 1], [0, 3], [1, 2], [1, 3], [0, 2]], dtype=np.int32)
-    result = np.array(
-        [[1, 2, 3], [28, 80, 162], [4, 5, 6], [28, 40, 54], [4, 10, 18]],
-        dtype=np.int32)
+    result = np.array([[1, 2, 3], [28, 80, 162], [4, 5, 6],
+                       [28, 40, 54], [4, 10, 18]], dtype=np.int32)
     with self.test_session(use_gpu=True):
       y_tf = reduce_slice_ops.reduce_slice_prod(x, indices, 0).eval()
       self.assertAllEqual(y_tf, result)
@@ -98,8 +93,8 @@ class ReduceSliceTest(TensorFlowTestCase):
   def testReduceSliceMax(self):
     x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)
     indices = np.array([[0, 1], [0, 3], [1, 2], [1, 3], [0, 2]], dtype=np.int32)
-    result = np.array(
-        [[1, 2, 3], [7, 8, 9], [4, 5, 6], [7, 8, 9], [4, 5, 6]], dtype=np.int32)
+    result = np.array([[1, 2, 3], [7, 8, 9], [4, 5, 6],
+                       [7, 8, 9], [4, 5, 6]], dtype=np.int32)
     with self.test_session(use_gpu=True):
       y_tf = reduce_slice_ops.reduce_slice_max(x, indices, 0).eval()
       self.assertAllEqual(y_tf, result)
@@ -107,8 +102,8 @@ class ReduceSliceTest(TensorFlowTestCase):
   def testReduceSliceMin(self):
     x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)
     indices = np.array([[0, 1], [0, 3], [1, 2], [1, 3], [0, 2]], dtype=np.int32)
-    result = np.array(
-        [[1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6], [1, 2, 3]], dtype=np.int32)
+    result = np.array([[1, 2, 3], [1, 2, 3], [4, 5, 6],
+                       [4, 5, 6], [1, 2, 3]], dtype=np.int32)
     with self.test_session(use_gpu=True):
       y_tf = reduce_slice_ops.reduce_slice_min(x, indices, 0).eval()
       self.assertAllEqual(y_tf, result)
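
For readers unfamiliar with the op under test, the behavior the sum tests exercise can be approximated in plain NumPy. This is a sketch whose semantics are inferred only from the expected values in the tests above, not from the op's documentation: each row `[start, end)` of a 2-D `indices` selects a slice along `axis`, and a 1-D `indices` is read as consecutive `[indices[i], indices[i+1])` pairs. The name `reduce_slice_sum_reference` is hypothetical and is not part of `reduce_slice_ops`.

```python
import numpy as np


def reduce_slice_sum_reference(x, indices, axis):
    """NumPy approximation of reduce_slice_sum, inferred from the tests."""
    x = np.asarray(x)
    indices = np.asarray(indices)
    if indices.ndim == 1:
        # Assumption: 1-D indices delimit consecutive half-open slices.
        indices = np.stack([indices[:-1], indices[1:]], axis=1)
    # Bring the reduced axis to the front so slicing is uniform.
    moved = np.moveaxis(x, axis, 0)
    # An empty slice sums to 0, the identity element for Sum.
    rows = [moved[start:end].sum(axis=0, dtype=x.dtype)
            for start, end in indices]
    # Put the per-slice results back along the original axis.
    return np.moveaxis(np.stack(rows, axis=0), 0, axis)


# Inputs taken from testReduceSliceSum2D and testReduceSliceSum1DIndices.
x2d = np.array([[1, 2, 3], [40, 50, 60], [700, 800, 900]], dtype=np.int32)
pairs = np.array([[0, 1], [0, 3], [1, 2], [1, 3], [0, 2]], dtype=np.int32)
y2d = reduce_slice_sum_reference(x2d, pairs, 0)

x5 = np.array([[1, 2, 3], [40, 50, 60], [700, 800, 900],
               [1000, 2000, 3000], [40000, 50000, 60000]], dtype=np.int32)
y1d = reduce_slice_sum_reference(x5, np.array([0, 0, 2, 5], dtype=np.int32), 0)
```

The other reductions in the tests (Prod, Max, Min) would follow the same slicing pattern with a different combining function and identity element, as the `CALL_ALL_REDUCEOPS` macro in the header suggests.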