aboutsummaryrefslogtreecommitdiffhomepage
path: root/tensorflow/core/framework/tensor.cc
diff options
context:
space:
mode:
authorGravatar A. Unique TensorFlower <gardener@tensorflow.org>2017-02-21 17:31:57 -0800
committerGravatar TensorFlower Gardener <gardener@tensorflow.org>2017-02-21 17:52:15 -0800
commit4891c01b1cadf085a915a3eac5dd1b8d8cdee203 (patch)
tree87ec00e1927877ba26a2ffb69bc4f74f25c36f6a /tensorflow/core/framework/tensor.cc
parent123c2bb0af532d5fdaa05358158da33497d4bfe6 (diff)
Allow (safe) in-place computation in TensorFlow C++ ops. When at least one input tensor has the same size and type as the output, and the underlying buffer is owned by the op, i.e. when its refcount is 1 at the time the op's Compute method executes, the computation can be performed in place and allocation of the output buffer avoided.
I updated the following ops to perform in-place computation automatically when possible: * All standard coefficient-wise unary and binary operators (including with broadcasting) inheriting from base classes in kernels/cwise_ops_common.h. * unary and binary operators inheriting from base classes in framework/numeric_op.h. This is mostly old code for the Relu family and associated gradients. * All linear algebra ops inheriting from linalg_common. * Misc individual files/ops: softmax, select, bias, aggregate ops, batch_norm & fused_batch_norm, adjust_hue, constant, depthwise_conv_grad, fractional_avg_pool, misc. pooling ops, matrix_set_diag, xent & sparse_xent, unique_op. Change: 148166936
Diffstat (limited to 'tensorflow/core/framework/tensor.cc')
-rw-r--r--tensorflow/core/framework/tensor.cc8
1 files changed, 8 insertions, 0 deletions
diff --git a/tensorflow/core/framework/tensor.cc b/tensorflow/core/framework/tensor.cc
index f622d031f2..68c6817448 100644
--- a/tensorflow/core/framework/tensor.cc
+++ b/tensorflow/core/framework/tensor.cc
@@ -526,6 +526,14 @@ void Tensor::UnsafeCopyFromInternal(const Tensor& other, DataType dtype,
}
}
+// Notice that buf_ either points to a regular TensorBuffer or a SubBuffer.
+// For the latter case, we have to make sure that the refcount is
+// one both for the SubBuffer _and_ the underlying TensorBuffer.
+bool Tensor::RefCountIsOne() const {
+ return buf_ != nullptr && buf_->RefCountIsOne() &&
+ buf_->root_buffer()->RefCountIsOne();
+}
+
// The macro CASES() expands to a switch statement conditioned on
// TYPE_ENUM. Each case expands the STMTS after a typedef for T.
#define SINGLE_ARG(...) __VA_ARGS__