path: root/tensorflow/core/lib/strings/numbers.cc
author Derek Murray <mrry@google.com> 2018-04-20 11:13:16 -0700
committer TensorFlower Gardener <gardener@tensorflow.org> 2018-04-20 11:16:20 -0700
commit 49f3469d9533cb12d06ed3907b4ced975e2fcea4
tree ca8af76cdfb5f7024b2a998c313afb061870d56b /tensorflow/core/lib/strings/numbers.cc
parent b3f379e907259aa166c1ef734ccfd03331eb0a94
Use CreateWorkerSession and DeleteWorkerSession for all distributed sessions.
This change adds a phase to the session creation protocol: the master now contacts all workers to register a session handle and create a "WorkerSession" on each worker before it first registers or runs a graph on any worker. Subsequent requests to a worker ensure that the worker has the session handle registered before performing the request, and an AbortedError is raised if the worker has not (e.g. because it restarted after a failure).

As a result, more failure cases are covered by the high-level APIs (tf.estimator, Slim, etc.) that recreate the session on receiving an AbortedError. Previously, there was a possible race condition in which a PS task could restart between variable initialization and the first step, leading to a FailedPreconditionError ("Attempting to use uninitialized value") that would not be handled by the high-level APIs.

PiperOrigin-RevId: 193694958
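
The protocol described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not TensorFlow's implementation: the `Worker`, `Master`, and `run_with_recovery` names, the string session handles, and the local `AbortedError` class are all hypothetical stand-ins chosen for illustration. It shows the master creating a worker session on every worker before any graph is registered, each worker checking the handle on every request, and a caller recreating the session when an `AbortedError` is raised.

```python
class AbortedError(Exception):
    """Hypothetical stand-in for the AbortedError named in the commit message."""


class Worker:
    """Toy worker that tracks which session handles it has registered."""

    def __init__(self):
        self.sessions = {}  # session handle -> set of registered graphs

    def create_worker_session(self, handle):
        self.sessions[handle] = set()

    def delete_worker_session(self, handle):
        self.sessions.pop(handle, None)

    def _require_session(self, handle):
        # Every request first checks that the handle is known to this worker.
        if handle not in self.sessions:
            raise AbortedError(f"session {handle!r} is not registered")

    def register_graph(self, handle, graph):
        self._require_session(handle)
        self.sessions[handle].add(graph)

    def run_graph(self, handle, graph):
        self._require_session(handle)
        return f"ran {graph}"

    def restart(self):
        # A restart after a failure loses all registered session handles.
        self.sessions.clear()


class Master:
    """Toy master that registers the session on all workers up front."""

    def __init__(self, workers):
        self.workers = workers
        self._next_id = 0

    def create_session(self):
        handle = f"session-{self._next_id}"
        self._next_id += 1
        # New phase: contact every worker before registering or running a graph.
        for worker in self.workers:
            worker.create_worker_session(handle)
        return handle

    def delete_session(self, handle):
        for worker in self.workers:
            worker.delete_worker_session(handle)

    def run(self, handle, graph):
        for worker in self.workers:
            worker.register_graph(handle, graph)
        return [worker.run_graph(handle, graph) for worker in self.workers]


def run_with_recovery(master, handle, graph):
    """Mimics a high-level API that recreates the session on AbortedError."""
    try:
        return handle, master.run(handle, graph)
    except AbortedError:
        master.delete_session(handle)
        handle = master.create_session()
        return handle, master.run(handle, graph)
```

In this model, a worker that restarts between two steps fails the next request with `AbortedError` instead of running against missing state, which is the condition a session-recreating caller such as `run_with_recovery` is able to handle.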
Diffstat (limited to 'tensorflow/core/lib/strings/numbers.cc')
0 files changed, 0 insertions, 0 deletions