path: root/tensorflow/core/kernels/resource_variable_ops.h
Commit message (Author, Age)
* [TF] Add new internal ops _VarHandlesOp and _ReadVariablesOp. (Peter Hawkins, 2018-09-26)
  The purpose of these ops is to fix a latency problem observed in an inference benchmark. An inference step often starts by reading the values of many (hundreds of) weights. For a resource variable, this requires a VarHandleOp and a ReadVariableOp per variable. Running hundreds of trivial ops can add hundreds of microseconds of latency to the critical path of an inference step. The inter-op latency of the executor can be hundreds of nanoseconds, which rapidly adds up. This change introduces two fused ops, _VarHandlesOp and _ReadVariablesOp, that allow us to read many variables in a pair of larger ops rather than many tiny ops.
  PiperOrigin-RevId: 214662338
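  The latency argument in this commit message can be checked with back-of-the-envelope arithmetic. The sketch below is a rough cost model only, not a measurement: the variable count (300) and the per-op overhead (500 ns) are assumed values picked from the "hundreds" ranges quoted in the message, and `dispatch_overhead_us` is a hypothetical helper, not a TensorFlow function. It counts executor dispatch overhead only and ignores the cost of actually reading the tensors, which the fused ops still pay.

```python
def dispatch_overhead_us(num_ops: int, per_op_overhead_ns: float) -> float:
    """Total per-op dispatch overhead, in microseconds, for num_ops ops."""
    return num_ops * per_op_overhead_ns / 1000.0


NUM_VARIABLES = 300          # assumption: "hundreds" of weights
PER_OP_OVERHEAD_NS = 500.0   # assumption: "hundreds of nanoseconds" per op

# Unfused: one VarHandleOp plus one ReadVariableOp per variable.
unfused_us = dispatch_overhead_us(2 * NUM_VARIABLES, PER_OP_OVERHEAD_NS)

# Fused: a single _VarHandlesOp plus a single _ReadVariablesOp in total.
fused_us = dispatch_overhead_us(2, PER_OP_OVERHEAD_NS)

print(f"unfused: {unfused_us:.1f} us, fused: {fused_us:.1f} us")
```

  Under these assumed numbers the unfused path spends hundreds of microseconds on dispatch alone, while the fused pair stays near one microsecond, which matches the order of magnitude described in the message.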
* Register DestroyResourceOp for XLA devices (Igor Ganichev, 2018-07-12)
  Before this change, we were not releasing device memory allocated by ResourceVariables.
  PiperOrigin-RevId: 204329027
* Don't XLA-compile naked variable reads (Igor Ganichev, 2018-05-24)
  Before this change, when we executed a naked variable read (i.e. outside of a defun, directly running <xla_device>->Compute()), the tf2xla kernel would copy the variable's tensor, leading to many unnecessary copies. This change uses the regular non-tf2xla kernel for naked variable reads and marks the tf2xla one as CompilationOnly().
  PiperOrigin-RevId: 197976146