path: root/tensorflow/contrib/cluster_resolver
Commit message (author, date)
* Internal change. (A. Unique TensorFlower, 2018-10-04)
    PiperOrigin-RevId: 215711454
* Job name should be picked based on the cluster_spec (Sourabh Bajaj, 2018-09-06)
    PiperOrigin-RevId: 211833041
* [tpu]: Have a more helpful error message when no TPU name is specified. (Brennan Saeta, 2018-08-02)
    PiperOrigin-RevId: 207147507
* Remove bytes prefix "b'" from tpu_cluster_resolver error messages (A. Unique TensorFlower, 2018-07-27)
    PiperOrigin-RevId: 206397604
* Add full cluster resolver package to contrib_py rather than the implementation-less skeleton. (Frank Chen, 2018-07-11)
    PiperOrigin-RevId: 204188173
* Support Cloud TPU Pod in GKE environment. (A. Unique TensorFlower, 2018-06-12)
    PiperOrigin-RevId: 200251004
* Check to ensure the Cloud TPU is ready before resolving. (Brennan Saeta, 2018-06-11)
    PiperOrigin-RevId: 200095692
* Typo fix in suggested pip message for tpu cluster resolver. (Taylor Robie, 2018-06-05)
    PiperOrigin-RevId: 199362908
* Amend cluster resolver error to suggest oauth2client as a possible issue. (A. Unique TensorFlower, 2018-06-01)
    PiperOrigin-RevId: 198894470
* Adds support for specifying a discovery_service_url (via either a parameter or an environment variable) within TPUClusterResolver (Frank Chen, 2018-05-21)
    PiperOrigin-RevId: 197494335
* [TPU]: If the $TPU_NAME env var is set, fall back to that. (Brennan Saeta, 2018-05-10)
    PiperOrigin-RevId: 196196939
* Support legacy clusters (Brennan Saeta, 2018-04-20)
    PiperOrigin-RevId: 193735742
* Merge changes from github. (Michael Case, 2018-04-10)
    PiperOrigin-RevId: 192388250
* [TPUClusterResolver] Start a TFServer when running in GKE (Brennan Saeta, 2018-04-06)
    This change allows advanced input pipelines (e.g. StreamingFilesDataset, or split pipelines that use py_func) to run in GKE and GKE-like environments.
    PiperOrigin-RevId: 191897639
* Remove all_opensource_files. It's not needed any more. (Martin Wicke, 2018-03-28)
    PiperOrigin-RevId: 190878279
* Helpful ImportError message (Brennan Saeta, 2018-03-07)
    PiperOrigin-RevId: 188261273
* Make sure the string returned is a string in Python 3 because of different string handling processes. (Frank Chen, 2018-03-07)
    PiperOrigin-RevId: 188180206
* Fix broken test (invalid string comparison in py3) (Christopher Suter, 2018-03-06)
    PiperOrigin-RevId: 188051422
* [TPU Cluster Resolver]: Integrate with GKE (Brennan Saeta, 2018-03-05)
    This change integrates the TPUClusterResolver with GKE's support for Cloud TPUs.
    PiperOrigin-RevId: 187961802
* Integrate ClusterResolvers with TPUEstimator. (Brennan Saeta, 2018-02-26)
    PiperOrigin-RevId: 187047094
* Internal-only change. (Brennan Saeta, 2018-02-21)
    PiperOrigin-RevId: 186528023
* Tpu cluster resolver only returns TF server addresses for 'HEALTHY' tpu nodes. (A. Unique TensorFlower, 2018-02-02)
    PiperOrigin-RevId: 184350480
* Fix docs generation for cluster_resolvers (Mark Daoust, 2018-01-31)
    Adds "cluster_resolver_pip" as a dependency to opensource contrib, and applies a standard `remove_undocumented` to clear extra symbols.
    Docs are built from a bazel build, and without this change the cluster resolvers are not directly accessible in "tf.contrib.cluster_resolver" during the docs build, so they do not get documented.
    PiperOrigin-RevId: 183993115
* Add functionality to auto-discover project and zone when they are not supplied to the TPUClusterResolver (Frank Chen, 2018-01-17)
    PiperOrigin-RevId: 182270565
* Remove hardcoded discovery document now that the TPU alpha API definitions are public (https://www.googleapis.com/discovery/v1/apis/tpu/v1alpha1/rest). (Frank Chen, 2017-11-22)
    PiperOrigin-RevId: 176710985
* Add a new method `get_master` to `TPUClusterResolver` such that users can easily specify the grpc connection string using ClusterResolvers rather than specifying the IP address manually. (Frank Chen, 2017-11-02)
    Also fixes a bug in the `TPUClusterResolverTest` that caused tests to not run at all.
    PiperOrigin-RevId: 174398488
* BUILD cleanup in contrib/... (A. Unique TensorFlower, 2017-10-30)
    PiperOrigin-RevId: 173889798
* Update APIs for TPU Cluster Resolver to remove the custom API definition and instead use a standard definition file stored in GCS. (Frank Chen, 2017-10-03)
    PiperOrigin-RevId: 170960877
* Automated g4 rollback of changelist 158565259 (Gunhan Gulsoy, 2017-09-14)
    PiperOrigin-RevId: 168650887
* Make a change to the Cluster Resolver API: If no `credentials` are passed in to the GCE and TPU Cluster Resolvers, then we will use the GoogleCredentials.get_application_default() credentials. (Frank Chen, 2017-08-08)
    If users want to pass in no credentials at all, then they will have to pass in "None" explicitly.
    PiperOrigin-RevId: 164659129
* The return type of the response in the Google Cloud TPU APIs is a dict rather than an object, so we need to use dict-syntax to access it. (Frank Chen, 2017-08-02)
    PiperOrigin-RevId: 164033254
* Add __init__.py to the contrib/cluster_resolver directory so that the Cluster Resolver classes within this are visible to open source TensorFlow users. (Frank Chen, 2017-07-31)
    PiperOrigin-RevId: 163733781
* Adds preliminary support for Cloud TPUs with Cluster Resolvers. (Frank Chen, 2017-07-27)
    This aims to allow users to have a better experience when specifying one or multiple Cloud TPUs for their training jobs by allowing users to use names rather than IP addresses.
    PiperOrigin-RevId: 163393443
* Merge changes from github. (Jonathan Hseu, 2017-07-19)
    END_PUBLIC
    ---
    Commit daa67ad17 authored by Jonathan Hseu <vomjom@vomjom.net>
    Committed by Frank Chen <frankchn@gmail.com>:
    Remove unittest import (#11596)
    ---
    Commit 491beb74c authored by A. Unique TensorFlower <gardener@tensorflow.org>
    Committed by TensorFlower Gardener <gardener@tensorflow.org>:
    BEGIN_PUBLIC
    Automated g4 rollback of changelist 162423171
    PiperOrigin-RevId: 162541442
* Further BUILD cleanup in tensorflow/contrib/... (A. Unique TensorFlower, 2017-07-19)
    PiperOrigin-RevId: 162466482
* Adds support for retrieving instances from the Google Compute Engine instance group APIs, with support (in conjunction with UnionClusterResolver) for mapping multiple instance groups into one TensorFlow job (see the `testUnionMultipleInstanceRetrieval` test for details). (Frank Chen, 2017-07-10)
    This should simplify creating and using standardized grpc TensorFlow server based instances using Compute Engine instance groups for distributed training.
    PiperOrigin-RevId: 161443891
* Selected BUILD cleanup in tensorflow/contrib/... (A. Unique TensorFlower, 2017-06-18)
    PiperOrigin-RevId: 159373397
* Adjust test sizes (A. Unique TensorFlower, 2017-06-09)
    PiperOrigin-RevId: 158565259
* Adds the base for ClusterResolvers, a new way of communicating with and retrieving cluster information for running distributed TensorFlow. (Frank Chen, 2017-06-07)
    Implementations of this class would eventually allow users to simply point TensorFlow at a cluster management endpoint, and TensorFlow will automatically retrieve the host names/IPs and port numbers of TensorFlow workers from the cluster management service.
    PiperOrigin-RevId: 158358761
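
Several entries in this log describe the same underlying idea: a resolver reports which hosts belong to each job, several resolvers can be merged into one job list, and a grpc connection string can be derived from the result instead of a hand-typed IP address. The following is a minimal pure-Python sketch of that idea, not the tf.contrib.cluster_resolver code itself. The class names `StaticResolver` and `UnionResolver`, the helper `master_address`, and all addresses and ports are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: hypothetical stand-ins, not the TensorFlow classes.
import abc


class ClusterResolver(abc.ABC):
    """Anything that can report which tasks make up a cluster."""

    @abc.abstractmethod
    def cluster_spec(self):
        """Return a dict mapping job name -> list of "host:port" strings."""


class StaticResolver(ClusterResolver):
    """Returns a fixed mapping; a real resolver would query a cluster manager."""

    def __init__(self, spec):
        # Copy the lists so later changes by the caller do not leak in.
        self._spec = {job: list(tasks) for job, tasks in spec.items()}

    def cluster_spec(self):
        return {job: list(tasks) for job, tasks in self._spec.items()}


class UnionResolver(ClusterResolver):
    """Merges several resolvers; tasks of the same job are concatenated in order."""

    def __init__(self, *resolvers):
        self._resolvers = resolvers

    def cluster_spec(self):
        merged = {}
        for resolver in self._resolvers:
            for job, tasks in resolver.cluster_spec().items():
                merged.setdefault(job, []).extend(tasks)
        return merged


def master_address(resolver, job_name="worker", task_index=0):
    """Build a grpc connection string for one task of the resolved cluster."""
    return "grpc://" + resolver.cluster_spec()[job_name][task_index]


# Two made-up instance groups that both contribute to the "worker" job.
group_a = StaticResolver({"worker": ["10.0.0.1:8470"]})
group_b = StaticResolver({"worker": ["10.0.0.2:8470"], "ps": ["10.0.0.3:8470"]})
union = UnionResolver(group_a, group_b)

print(union.cluster_spec())
print(master_address(union))
```

In this sketch the union keeps the order in which resolvers are passed, so task indices within a job stay stable as long as the argument order does. Whether the library's own union resolver behaves the same way should be checked against its documentation.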