author: Derek Murray <mrry@google.com> 2016-02-25 20:10:09 -0800
committer: TensorFlower Gardener <gardener@tensorflow.org> 2016-02-25 20:15:52 -0800
commit: 00986d48bb646daab659503ad3a713919865f32d
tree: 3179208eda8426b346db591f7d98fd836a20f384 /tensorflow/core/protobuf/master_service.proto
parent: d27da251bcc4bab7da2f5aecc509b146f9fa1692
Initial version of the open-source distributed TensorFlow runtime.

This includes a gRPC server (grpc_tensorflow_server) that can serve as both the master of a distributed TensorFlow computation and an individual worker in the computation. The GrpcSession class is included to allow client programs (including Python clients) to interact with a server.

See tensorflow/core/distributed_runtime/README.md for usage instructions.

This change partially addresses issue #23.
Change: 115634191
Diffstat (limited to 'tensorflow/core/protobuf/master_service.proto')
-rw-r--r-- tensorflow/core/protobuf/master_service.proto 105
1 files changed, 105 insertions, 0 deletions
diff --git a/tensorflow/core/protobuf/master_service.proto b/tensorflow/core/protobuf/master_service.proto
new file mode 100644
index 0000000000..13b0a97b11
--- /dev/null
+++ b/tensorflow/core/protobuf/master_service.proto
@@ -0,0 +1,105 @@
+/* Copyright 2016 Google Inc. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+syntax = "proto3";
+
+package tensorflow.grpc;
+option java_outer_classname = "MasterServiceProtos";
+option java_multiple_files = true;
+option java_package = "org.tensorflow.distruntime";
+
+import "tensorflow/core/protobuf/master.proto";
+
+////////////////////////////////////////////////////////////////////////////////
+//
+// MasterService defines a TensorFlow service with which a client can
+// interact to execute a distributed TensorFlow computation.
+//
+// A master service keeps track of multiple "master sessions". Each
+// session encapsulates a computation graph and its associated state,
+// and typically corresponds to a single "client session" (e.g. a
+// `tensorflow::Session` instance).
+//
+// A session is responsible for the following:
+// * assigning each node to a device (locally or remotely) using a
+// placement algorithm. This may make decisions based on collected
+// statistics from the workers in the system (e.g., memory usage,
+// bandwidth consumption, etc.)
+//
+// * inserting intermediate nodes and edges to support cross-device
+// and cross-process data flows and resource management.
+//
+// * issuing commands to workers to execute the subgraphs associated
+// with those workers.
+//
+// Typically, a client carries out an iterative computation
+// (e.g. training) by invoking RPCs against the master in a
+// client-side loop. The client first creates a client session that
+// connects to a particular master (using gRPC for example). The
+// master creates a corresponding master session that is hosted on
+// the master and caches state between the client's invocations.
+//
+// After the session is established, the master returns an opaque
+// handle to the client that can be used to associate the client and
+// master sessions.
+//
+// The client may send an initial graph to the master in the
+// CreateSession call, and add nodes to the graph using ExtendSession.
+//
+// The most frequent operation a master performs is "RunStep", which
+// implements the `Session::Run()` API. It supports feeding in arguments,
+// executing a dataflow computation, and fetching results.
+//
+// Finally, when the client no longer needs the session, it should
+// close the session by invoking CloseSession, which allows the master
+// to reclaim resources associated with the session. The master may
+// implement a garbage collection scheme that closes sessions that
+// have been inactive for some time.
+//
+// For example, the following pseudo-code illustrates how a client
+// interacts with a master:
+//
+// stub = NewStub("/job:mnist/replica:0/task:0")
+// {handle} = stub->CreateSession({graph_def})
+// do {
+// stub->RunStep({handle, {feeds}, {fetches}})
+// // The client can evaluate a predicate locally, based on the
+// // result of `fetches`, to determine whether to terminate. For
+// // example, it might fetch the loss and evaluate whether it is less
+// // than some threshold.
+// } while (!should_stop({fetches}));
+// stub->CloseSession({handle})
+//
+////////////////////////////////////////////////////////////////////////////////
+
+service MasterService {
+ // Creates a session.
+ rpc CreateSession(CreateSessionRequest) returns (CreateSessionResponse);
+
+ // Extends a session.
+ rpc ExtendSession(ExtendSessionRequest) returns (ExtendSessionResponse);
+
+ // Drives the graph computation.
+ rpc RunStep(RunStepRequest) returns (RunStepResponse);
+
+ // Closes a session.
+ rpc CloseSession(CloseSessionRequest) returns (CloseSessionResponse);
+
+ // Lists the devices usable by the master.
+ rpc ListDevices(ListDevicesRequest) returns (ListDevicesResponse);
+
+ // Closes all existing sessions.
+ rpc Reset(ResetRequest) returns (ResetResponse);
+}
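
The session lifecycle described in the header comment (CreateSession, repeated RunStep, then CloseSession) can be sketched in runnable form. This is only an illustration and not part of the commit: `FakeMaster`, its method names, the `initial_loss` feed, and the halving "loss" are all hypothetical stand-ins for a real gRPC stub and a real graph, chosen so that the client-side loop and its local stopping predicate can be run without a server.

```python
import itertools


class FakeMaster:
    """Hypothetical in-memory stand-in for a MasterService stub (not a real API)."""

    def __init__(self):
        self._sessions = {}            # handle -> per-session state cached by the master
        self._ids = itertools.count()  # source of unique handle suffixes

    def create_session(self, graph_def):
        # The master returns an opaque handle that identifies the master session.
        handle = "session-%d" % next(self._ids)
        self._sessions[handle] = {"graph": list(graph_def), "steps": 0}
        return handle

    def extend_session(self, handle, new_nodes):
        # Adds nodes to the graph held by an existing session.
        self._sessions[handle]["graph"].extend(new_nodes)

    def run_step(self, handle, feeds, fetches):
        # Toy computation (an assumption): the fetched value halves on every step.
        state = self._sessions[handle]
        state["steps"] += 1
        loss = feeds["initial_loss"] / (2 ** state["steps"])
        return {name: loss for name in fetches}

    def close_session(self, handle):
        # Lets the master reclaim the state associated with the session.
        del self._sessions[handle]

    def open_sessions(self):
        return len(self._sessions)


def train(master, threshold):
    """Runs steps until the fetched loss drops below threshold; returns the step count."""
    handle = master.create_session(["node_a"])
    steps = 0
    while True:
        fetched = master.run_step(handle, {"initial_loss": 1.0}, ["loss"])
        steps += 1
        # The client evaluates the stopping predicate locally on the fetched value.
        if fetched["loss"] < threshold:
            break
    master.close_session(handle)
    return steps
```

Calling `train(FakeMaster(), 0.1)` runs the loop until the toy loss falls below 0.1 and then closes the session, mirroring the do/while pseudo-code in the header comment.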