tensorflow - machine learning framework

	Commit message (Collapse)	Author	Age
*	[TF] Add new internal ops _VarHandlesOp and _ReadVariablesOp.	Peter Hawkins	2018-09-26
	The purpose of these ops is to fix a latency problem observed for an inference benchmark. Often an inference step starts by reading the values of many (hundreds of) weights. For a resource variable, this requires a VarHandleOp and a ReadVariableOp per variable. Running hundreds of trivial ops can add hundreds of microseconds of latency to the critical path of an inference step. The inter-op latency of the executor can be hundreds of nanoseconds, which rapidly adds up. This change introduces two fused ops, _VarHandlesOp and _ReadVariablesOp, that allow us to read many variables in a pair of larger ops rather than in many tiny ops. PiperOrigin-RevId: 214662338
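	The latency argument above can be checked with back-of-envelope arithmetic. The sketch below is illustrative only: the variable count and per-op latency are assumed values chosen to match "hundreds" of weights and "hundreds of nanoseconds", not measurements from the benchmark, and the helper name is hypothetical. It models executor dispatch overhead only and ignores the work each op does.

```python
def dispatch_overhead_us(num_ops, per_op_latency_ns):
    """Executor dispatch overhead, in microseconds, for ops run back to back."""
    return num_ops * per_op_latency_ns / 1000.0


# Assumed figures, not taken from the commit.
NUM_VARIABLES = 300       # "hundreds" of weights
PER_OP_LATENCY_NS = 500   # "hundreds of nanoseconds" of inter-op latency

# Unfused: one VarHandleOp plus one ReadVariableOp per variable.
unfused_ops = 2 * NUM_VARIABLES
# Fused: one _VarHandlesOp plus one _ReadVariablesOp in total.
fused_ops = 2

unfused_us = dispatch_overhead_us(unfused_ops, PER_OP_LATENCY_NS)
fused_us = dispatch_overhead_us(fused_ops, PER_OP_LATENCY_NS)

print(f"unfused: {unfused_ops} ops, {unfused_us} us of dispatch overhead")
print(f"fused:   {fused_ops} ops, {fused_us} us of dispatch overhead")
```

	Under these assumptions the unfused path spends about 300 microseconds on dispatch alone, consistent with the "hundreds of microseconds" described above, while the fused pair spends about 1 microsecond. The fused ops still have to do the per-variable work, so the real saving is smaller than this ratio suggests.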
*	Register DestroyResourceOp for XLA devices	Igor Ganichev	2018-07-12
	Before this change, we were not releasing device memory allocated by ResourceVariables. PiperOrigin-RevId: 204329027
*	Don't XLA-compile naked variable reads	Igor Ganichev	2018-05-24
	Before this change, when we executed a naked variable read (i.e. outside of a defun, directly running <xla_device>->Compute()), the tf2xla kernel would copy the variable's tensor, leading to many unnecessary copies. This change uses the regular non-tf2xla kernel for naked variable reads and marks the tf2xla one as CompilationOnly(). PiperOrigin-RevId: 197976146