author | Peter Hawkins <phawkins@google.com> | 2018-09-26 13:48:21 -0700 |
---|---|---|
committer | TensorFlower Gardener <gardener@tensorflow.org> | 2018-09-26 13:51:50 -0700 |
commit | 1736e0bbbfdeeba178dff37c970b5a0180ee013f (patch) | |
tree | 390c309b5997a752644d2c50bb4ee5bf8fc1654d /tensorflow/compiler | |
parent | 652ce1aaefdadd04a9905a0788ab26c6fff93658 (diff) |
[TF] Add new internal ops _VarHandlesOp and _ReadVariablesOp.
The purpose of these ops is to fix a latency problem observed for an inference benchmark. Often an inference step starts by reading the values of many (often hundreds of) weights. For a resource variable, this requires a VarHandleOp and a ReadVariableOp per variable. Running hundreds of trivial ops can add hundreds of microseconds of latency to the critical path of an inference step. The inter-op latency of the executor can be hundreds of nanoseconds, which rapidly adds up.
This change introduces two fused ops, _VarHandlesOp and _ReadVariablesOp, that allow us to read many variables with a pair of larger ops rather than many tiny ops.
PiperOrigin-RevId: 214662338
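The latency argument in the commit message can be checked with a back-of-the-envelope model. The sketch below is not TensorFlow code: `DispatchModel`, `UnfusedOverheadNs`, and `FusedOverheadNs` are hypothetical names, and the model assumes that each dispatched op adds a fixed inter-op overhead and that the work done inside the kernels is negligible by comparison.

```cpp
#include <cstdint>

// Hypothetical cost model (an assumption for illustration, not TensorFlow
// code): every op the executor dispatches adds a fixed overhead to the
// critical path, and kernel execution time is ignored.
struct DispatchModel {
  std::int64_t num_variables;      // assumed: weights read per inference step
  std::int64_t per_op_latency_ns;  // assumed: executor inter-op latency
};

// Unfused: one VarHandleOp plus one ReadVariableOp per variable.
constexpr std::int64_t UnfusedOverheadNs(const DispatchModel& m) {
  return 2 * m.num_variables * m.per_op_latency_ns;
}

// Fused: a single _VarHandlesOp plus a single _ReadVariablesOp in total.
constexpr std::int64_t FusedOverheadNs(const DispatchModel& m) {
  return 2 * m.per_op_latency_ns;
}
```

With assumed values of 300 variables and 500 ns per op, `UnfusedOverheadNs` gives 300,000 ns (300 microseconds), while `FusedOverheadNs` gives 1,000 ns. The fused ops each do more work than a single tiny op, so the real saving would be somewhat smaller than this model suggests.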
Diffstat (limited to 'tensorflow/compiler')
-rw-r--r-- | tensorflow/compiler/jit/xla_device_ops.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/tensorflow/compiler/jit/xla_device_ops.h b/tensorflow/compiler/jit/xla_device_ops.h
index 2ccee79761..6967ad1f03 100644
--- a/tensorflow/compiler/jit/xla_device_ops.h
+++ b/tensorflow/compiler/jit/xla_device_ops.h
@@ -100,9 +100,15 @@ class XlaAssignVariableOp : public AsyncOpKernel {
       Name("VarHandleOp").Device(DEVICE).HostMemory("resource"), \
       ResourceHandleOp<Var>); \
   REGISTER_KERNEL_BUILDER( \
+      Name("_VarHandlesOp").Device(DEVICE).HostMemory("resources"), \
+      ResourceHandlesOp<Var>); \
+  REGISTER_KERNEL_BUILDER( \
       Name("ReadVariableOp").Device(DEVICE).HostMemory("resource"), \
       ReadVariableOp); \
   REGISTER_KERNEL_BUILDER( \
+      Name("_ReadVariablesOp").Device(DEVICE).HostMemory("resources"), \
+      ReadVariablesOp); \
+  REGISTER_KERNEL_BUILDER( \
       Name("DestroyResourceOp").Device(DEVICE).HostMemory("resource"), \
       DestroyResourceOp); \
   REGISTER_KERNEL_BUILDER(Name("Shape") \