path: root/tensorflow/tools/dist_test/scripts/k8s_tensorflow.py
author Shanqing Cai <cais@google.com> 2016-09-14 19:00:00 -0800
committer TensorFlower Gardener <gardener@tensorflow.org> 2016-09-14 20:03:21 -0700
commit 65b010308c2ab3f365b5b9b40dd56591b179b996
tree ec6f50c7190b8d33fc1316fd6ad507ac0c949b2a /tensorflow/tools/dist_test/scripts/k8s_tensorflow.py
parent 57e23cadaed1cfd5245192ee44e8f89713ca01e5
Update & fix OSS distributed TF tests: mnist_replica
1) Replace the old, breaking docker-in-docker local test with a single-instance, multi-process test, built upon GitHub PR https://github.com/tensorflow/tensorflow/pull/3935. This simplifies the local test and makes it less susceptible to future changes in docker-in-docker support by Docker.
2) Add an --existing_servers flag to mnist_replica.py and the associated bash scripts, to distinguish
   a) the case in which we want to create in-process servers and supervisors (as in the new local_test.sh), and
   b) the case in which gRPC TF servers already exist and we just want to connect to the workers (as in remote_test.sh).
3) Rename some flags in the bash scripts to improve consistency with mnist_replica.py.
4) Make related doc changes in README.md.

Change: 133209130
Diffstat (limited to 'tensorflow/tools/dist_test/scripts/k8s_tensorflow.py')
-rwxr-xr-x tensorflow/tools/dist_test/scripts/k8s_tensorflow.py 19
1 file changed, 17 insertions, 2 deletions
diff --git a/tensorflow/tools/dist_test/scripts/k8s_tensorflow.py b/tensorflow/tools/dist_test/scripts/k8s_tensorflow.py
index 3a427a1d4e..854c6b832a 100755
--- a/tensorflow/tools/dist_test/scripts/k8s_tensorflow.py
+++ b/tensorflow/tools/dist_test/scripts/k8s_tensorflow.py
@@ -136,6 +136,19 @@ spec:
   selector:
     tf-ps: "{param_server_id}"
 """)
+PARAM_LB_SVC = ("""apiVersion: v1
+kind: Service
+metadata:
+  name: tf-ps{param_server_id}
+  labels:
+    tf-ps: "{param_server_id}"
+spec:
+  type: LoadBalancer
+  ports:
+  - port: {port}
+  selector:
+    tf-ps: "{param_server_id}"
+""")
def main():
@@ -218,8 +231,10 @@ def GenerateConfig(num_workers,
num_param_servers,
port))
     config += '---\n'
-    config += PARAM_SERVER_SVC.format(port=port,
-                                      param_server_id=param_server)
+    if request_load_balancer:
+      config += PARAM_LB_SVC.format(port=port, param_server_id=param_server)
+    else:
+      config += PARAM_SERVER_SVC.format(port=port, param_server_id=param_server)
     config += '---\n'
   return config
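
The following is a minimal, standalone sketch of what the second hunk does: for each parameter server it emits either the new LoadBalancer service or the plain service, separated by YAML document markers. It is not the script itself. The function name `render_ps_services` is hypothetical, the YAML indentation of `PARAM_LB_SVC` is assumed, and `PARAM_SERVER_SVC_STANDIN` is an assumed stand-in, since the real `PARAM_SERVER_SVC` template is not fully visible in this diff.

```python
# PARAM_LB_SVC mirrors the template added in this commit; the YAML
# indentation is an assumption, as the rendered diff does not preserve it.
PARAM_LB_SVC = """apiVersion: v1
kind: Service
metadata:
  name: tf-ps{param_server_id}
  labels:
    tf-ps: "{param_server_id}"
spec:
  type: LoadBalancer
  ports:
  - port: {port}
  selector:
    tf-ps: "{param_server_id}"
"""

# Assumed stand-in for PARAM_SERVER_SVC (not the real template): the same
# service without "type: LoadBalancer", so Kubernetes uses its default type.
PARAM_SERVER_SVC_STANDIN = """apiVersion: v1
kind: Service
metadata:
  name: tf-ps{param_server_id}
  labels:
    tf-ps: "{param_server_id}"
spec:
  ports:
  - port: {port}
  selector:
    tf-ps: "{param_server_id}"
"""


def render_ps_services(num_param_servers, port, request_load_balancer):
  """Hypothetical helper: renders one service per parameter server."""
  config = ''
  for param_server in range(num_param_servers):
    if request_load_balancer:
      template = PARAM_LB_SVC
    else:
      template = PARAM_SERVER_SVC_STANDIN
    config += template.format(port=port, param_server_id=param_server)
    config += '---\n'  # YAML document separator between services
  return config


if __name__ == '__main__':
  # Port 2222 is an arbitrary example value.
  print(render_ps_services(num_param_servers=2, port=2222,
                           request_load_balancer=True))
```

Selecting the template with a single boolean leaves the rest of the generated config unchanged, so the same cluster definition can be used with or without externally reachable parameter servers.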