path: root/tensorflow/tools/dist_test/server/Dockerfile
Commit message - Author, Age
* Use 'LABEL maintainer=' in Dockerfile (#13961) - Yong Tang, 2017-10-24

    * Use 'LABEL maintainer=' in Dockerfile

      This fix is a follow up of 13661 to replace `MAINTAINER` with `LABEL maintainer=` in Dockerfile.

      The keyword `MAINTAINER` has long been deprecated and is replaced by `LABEL`, which is much more flexible and is easily searchable through `docker inspect`.

      This fix replaces remaining `MAINTAINER` with `LABEL`.

      Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

    * Additional `MAINTAINER` -> `LABEL`

      Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
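The `MAINTAINER` to `LABEL` replacement described above can be sketched as a small filter. This is a minimal sketch, assuming bash and a POSIX-style `sed -E`; the function name `maintainer_to_label` is hypothetical and is not part of the TensorFlow repository, and the commit itself may well have been made by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the TensorFlow repo): read a Dockerfile on
# stdin and rewrite each deprecated `MAINTAINER <value>` instruction as
# `LABEL maintainer="<value>"`. All other lines pass through unchanged.
# Assumption: the maintainer value contains no double quotes.
maintainer_to_label() {
  sed -E 's/^[[:space:]]*MAINTAINER[[:space:]]+(.*)$/LABEL maintainer="\1"/'
}

# Example invocation (writes the converted Dockerfile to stdout):
#   maintainer_to_label < Dockerfile
```

Once an image is built from the converted Dockerfile, the label can be read back with `docker inspect --format '{{ index .Config.Labels "maintainer" }}' <image>`, which is the searchability the commit message refers to.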
* Upgrade docker images for distributed testing to ubuntu:16.04 (#7580) - Shanqing Cai, 2017-02-16
* Let build_server.sh take whl file URL as an input argument. (#5206) - Shanqing Cai, 2016-10-25

    This makes it possible to test the OSS GRPC distributed runtime in dist_test/remote_test.sh against a release build.

    Usage example:

    1. Build the server using a release whl file. (This means that the Linux CPU PIP release build has to pass first.)

        $ export DOCKER_VERSION_TAG="0.11.0rc1"
        $ tensorflow/tools/dist_test/build_server.sh \
            tensorflow/tf_grpc_test_server:${DOCKER_VERSION_TAG} \
            http://ci.tensorflow.org/view/Release/job/release-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-${DOCKER_VERSION_TAG}-cp27-none-linux_x86_64.whl \
            --test

    2. Run remote_test.sh:

        $ export TF_DIST_DOCKER_NO_CACHE=1
        $ export TF_DIST_SERVER_DOCKER_IMAGE="tensorflow/tf_grpc_test_server:${DOCKER_VERSION_TAG}"
        $ export TF_DIST_GCLOUD_PROJECT="my-project"
        $ export TF_DIST_GCLOUD_COMPUTE_ZONE="my-zone"
        $ export TF_DIST_CONTAINER_CLUSTER="my-cluster"
        $ export TF_DIST_GCLOUD_KEY_FILE="/path/to/my/key.json"
        $ tensorflow/tools/dist_test/remote_test.sh \
            "http://ci.tensorflow.org/view/Release/job/release-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-${DOCKER_VERSION_TAG}-cp27-none-linux_x86_64.whl"
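Because the same release wheel URL appears in both steps of the usage example, it may be convenient to compose it once from the version tag. The following is a minimal sketch, assuming bash; the function name `release_whl_url` is hypothetical, and the URL layout is copied from the usage example in this commit message rather than from any documented, stable scheme.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the TensorFlow repo): print the release wheel
# URL for a given version tag, using the URL pattern shown in the commit
# message's usage example (Linux CPU, Python 2.7 build).
release_whl_url() {
  local version_tag="$1"
  local base="http://ci.tensorflow.org/view/Release/job/release-matrix-cpu"
  local config="TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave"
  printf '%s/%s/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-%s-cp27-none-linux_x86_64.whl\n' \
    "$base" "$config" "$version_tag"
}

# Example: reuse one URL for both build_server.sh and remote_test.sh.
#   WHL_URL="$(release_whl_url "${DOCKER_VERSION_TAG}")"
```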
* Update version strings from 0.11.0rc0 to 0.11.0rc1. - Yifei Feng, 2016-10-14
* Update version string to 0.11.rc0 - Gunhan Gulsoy, 2016-09-23
* Update artifact version in tools/dist_test - Shanqing Cai, 2016-09-16
* Update URLs to nightly build artifacts and history pages - Shanqing Cai, 2016-09-06
    Change: 132313453
* Update jenkins URL to https - Shanqing Cai, 2016-08-19
    Change: 130741928
* Merge changes from github. - A. Unique TensorFlower, 2016-07-31
    Change: 128958134
* Merge changes from github. - A. Unique TensorFlower, 2016-06-30
    Change: 126335170
* Update copyright for 3p/tf/tools. - A. Unique TensorFlower, 2016-06-02
    Change: 123889091
* Merge changes from github. - A. Unique TensorFlower, 2016-05-05
    Change: 121586635
* Merge changes from github. - Illia Polosukhin, 2016-04-18
    Change: 120185825
* Add SyncReplicasOptimizer test in dist_test - A. Unique TensorFlower, 2016-04-11

    Usage example:
        ./remote_test.sh --num-workers 3 --sync-replicas

    Also changed:
    1) In local and remote tests, let different workers contact separate GRPC sessions.
    2) In local and remote tests, add the ability to specify the number of workers. Previously it was hard-coded at 2.
       Usage example:
           ./remote_test.sh --num-workers 2 --sync-replicas
    3) Use the device setter in mnist_replica.py.

    Change: 119599547
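The `--num-workers` and `--sync-replicas` flags shown in the usage examples could be handled along the following lines. This is a minimal sketch, assuming bash; the function `parse_dist_test_args` and the variables `NUM_WORKERS` and `SYNC_REPLICAS` are hypothetical names chosen for illustration and are not taken from remote_test.sh. The default of 2 workers mirrors the previously hard-coded value mentioned in the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical flag parser (not the actual remote_test.sh code).
# Sets NUM_WORKERS (default 2) and SYNC_REPLICAS (0 or 1) from the arguments.
parse_dist_test_args() {
  NUM_WORKERS=2     # default matches the old hard-coded worker count
  SYNC_REPLICAS=0   # synchronous replica training is off unless requested
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --num-workers)
        if [ "$#" -lt 2 ]; then
          echo "--num-workers requires a value" >&2
          return 1
        fi
        NUM_WORKERS="$2"
        shift 2
        ;;
      --sync-replicas)
        SYNC_REPLICAS=1
        shift
        ;;
      *)
        echo "Unknown argument: $1" >&2
        return 1
        ;;
    esac
  done
}

# Example: parse_dist_test_args --num-workers 3 --sync-replicas
```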
* Test for distributed (grpc) runtime in OSS TensorFlow - A. Unique TensorFlower, 2016-03-18

    See README.md for detailed descriptions of the usage of the tools and tests in this changeset.

    Three modes of testing are supported:
    1) Launch a local Kubernetes (k8s) cluster and run the test suites on it (see local_test.sh)
    2) Launch a remote k8s cluster on Google Container Engine (GKE) and run the test suite on it (see remote_test.sh)
    3) Run the test suite on an existing k8s TensorFlow cluster (also see remote_test.sh)

    Taking the remote test as an example, the following steps are performed:
    1) Build a Docker image with gcloud and Kubernetes tools, and the latest TensorFlow pip installed (see Dockerfile)
    2) Launch a Docker container based on that image (see test_distributed.sh)
    3) From within the container, authenticate the gcloud user (with credentials files mapped from outside the container), configure the k8s cluster and launch a new k8s container cluster for TensorFlow workers
    4) Generate a k8s (yaml) config file and use this yaml file to create a TensorFlow worker cluster consisting of a certain number of parameter servers (ps) and workers. The workers are exposed as external services with public IPs (see dist_test.sh)
    5) Run a simple softmax MNIST model on multiple workers, with the model weights and biases located on the ps nodes. Train the models in parallel and observe the final validation cross entropy (see dist_mnist_test.sh)

    Change: 117543657
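Step 4 creates a cluster with a given number of parameter servers and workers, and each of them needs an address the others can reach. The following is a minimal sketch, assuming bash, of how such address lists might be composed. The function name `grpc_host_list`, the `tf-ps` and `tf-worker` name prefixes, and the default port 2222 are all assumptions made for illustration and are not confirmed by this commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the TensorFlow repo): print a comma-separated
# list of <prefix><index>:<port> addresses for `count` cluster members.
# The prefixes and the default port are illustrative assumptions.
grpc_host_list() {
  local prefix="$1" count="$2" port="${3:-2222}"
  local hosts="" i
  for ((i = 0; i < count; i++)); do
    # Add a comma separator only when the list already has an entry.
    hosts+="${hosts:+,}${prefix}${i}:${port}"
  done
  printf '%s\n' "$hosts"
}

# Example: address lists for 1 parameter server and 3 workers.
#   PS_HOSTS="$(grpc_host_list tf-ps 1)"
#   WORKER_HOSTS="$(grpc_host_list tf-worker 3)"
```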