path: root/tensorflow/core/distributed_runtime/worker.cc
* Collective Ops Part 8 (A. Unique TensorFlower, 2018-06-08)

  Enable collective op execution in distributed mode: pass collective_graph_key into graph building and step execution contexts (MasterSession), where it triggers allocation of an RpcCollectiveExecutorMgr that becomes accessible via the WorkerEnv and MasterEnv. The collective_graph_key is used to synchronize step_ids (which are otherwise random) between otherwise independent graph executions that contain collective ops that need to rendezvous. All APIs for using collectives are still non-public and experimental.

  PiperOrigin-RevId: 199879087
* Collective Ops Part 6 (A. Unique TensorFlower, 2018-05-09)

  Distributed-mode implementations of CollectiveRemoteAccess. Extend Worker interface with corresponding new methods. This change is part of a series of changes introducing infrastructure for collective ops and initial implementations of reduction and broadcast.

  PiperOrigin-RevId: 196010718
* Collective Ops Part 5 (A. Unique TensorFlower, 2018-05-01)

  Distributed-mode implementations of DeviceResolverInterface and ParamResolverInterface. Extend Worker interface with new methods in support of these interfaces. This change is part of a series of changes introducing infrastructure for collective ops and initial implementations of reduction and broadcast.

  PiperOrigin-RevId: 194984585
* Add a ten-second timeout to the DeleteWorkerSession call. (Derek Murray, 2018-04-18)

  Previously, `MasterSession::Close()` did not block on the cleanup RPCs to the individual workers, leading to deployments where the remote workers might be shut down (e.g. by an external mechanism) before the session was closed. In order to switch over to using DeleteWorkerSession for all sessions, and preserve backwards compatibility, we need to permit this behavior. Therefore, this CL adds a 10-second timeout on the requests to workers, and logs an error if the request does not succeed in that time period.

  PiperOrigin-RevId: 193441618
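  The pattern behind this change can be shown without any TensorFlow internals. A minimal, self-contained sketch (the simulated RPC and names below are illustrative assumptions, not the actual worker code): issue the asynchronous cleanup call, wait on it for a bounded time, and log instead of blocking forever.

    #include <chrono>
    #include <future>
    #include <iostream>
    #include <memory>
    #include <thread>

    // Stand-in for an asynchronous DeleteWorkerSession-style RPC: runs on a
    // background thread and fulfills the returned future when it completes.
    std::future<void> DeleteWorkerSessionAsync() {
      auto done = std::make_shared<std::promise<void>>();
      std::future<void> result = done->get_future();
      std::thread([done] {
        std::this_thread::sleep_for(std::chrono::seconds(2));  // simulated work
        done->set_value();
      }).detach();
      return result;
    }

    int main() {
      std::future<void> pending = DeleteWorkerSessionAsync();
      // Block session close for at most 10 seconds; if the worker never
      // answers, log the problem and move on instead of hanging the master.
      if (pending.wait_for(std::chrono::seconds(10)) != std::future_status::ready) {
        std::cerr << "DeleteWorkerSession did not complete within 10 seconds\n";
      }
      return 0;
    }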
* Never use the LegacySession when a Master explicitly calls CreateWorkerSession. (Derek Murray, 2018-04-18)

  Previously, if the session handle was unrecognized by the worker, it would default to using the LegacySession. This prevents us from noticing that a server has been restarted. To address the problem in a backwards-compatible way, we add a bit to each session-handle-carrying worker request, indicating whether the master believes that CreateWorkerSession has been called. If this bit is set and the handle is unrecognized, the worker will raise an AbortedError, which can be caught by high-level frameworks such as `tf.estimator`. Note that CreateWorkerSession is not yet used by default, and a follow-up change will add that.

  PiperOrigin-RevId: 193427057
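  A condensed sketch of the lookup logic described above (the struct, field, and result names are assumptions chosen for illustration, not the exact worker code):

    #include <string>
    #include <unordered_set>

    enum class LookupResult { kFound, kUseLegacySession, kAborted };

    struct SessionRequest {
      std::string session_handle;
      bool create_worker_session_called = false;  // the new bit set by the master
    };

    LookupResult ResolveSession(const std::unordered_set<std::string>& known_handles,
                                const SessionRequest& req) {
      if (known_handles.count(req.session_handle) > 0) return LookupResult::kFound;
      // Unknown handle: if the master says CreateWorkerSession was called, this
      // worker must have restarted and lost its state, so fail loudly with an
      // Aborted-style error instead of silently falling back.
      if (req.create_worker_session_called) return LookupResult::kAborted;
      // Old masters never call CreateWorkerSession; keep the legacy behavior.
      return LookupResult::kUseLegacySession;
    }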
* Avoid capturing unused variables in lambda functions (Benoit Steiner, 2018-03-12)

  PiperOrigin-RevId: 188747641
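  The kind of cleanup this refers to, in a self-contained toy example (the names are hypothetical, not code from worker.cc):

    #include <functional>

    void Schedule(const std::function<void()>& fn) { fn(); }

    void Example() {
      int needed = 1;
      int extra = 2;
      // Before: `extra` appears in the capture list but is never used in the
      // body, so it is copied into the closure for nothing.
      Schedule([needed, extra] { (void)needed; });
      // After: capture only what the closure actually uses.
      Schedule([needed] { (void)needed; });
      (void)extra;
    }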
* Fix potential use-after-free bugs in the worker with DeleteWorkerSession. (Derek Murray, 2018-01-15)

  Previously, DeleteWorkerSession was responsible for freeing the WorkerSession owned by the SessionMgr. However, it is possible for other requests to be in-flight on the same session, and requests from the master to be delivered out of order, which leads to the potential for a request to use a WorkerSession after it has been freed. Revise the SessionMgr interface to handle std::shared_ptr<WorkerSession> instead of raw pointers to avoid this risk.

  PiperOrigin-RevId: 181975078
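  A minimal sketch of the ownership model the commit describes (class and method names below are simplified stand-ins, not the real SessionMgr API):

    #include <memory>
    #include <string>
    #include <unordered_map>

    struct WorkerSessionState { /* graph mgr, rendezvous mgr, devices, ... */ };

    class SessionRegistry {
     public:
      std::shared_ptr<WorkerSessionState> Find(const std::string& handle) const {
        auto it = sessions_.find(handle);
        return it == sessions_.end() ? nullptr : it->second;
      }
      void Delete(const std::string& handle) {
        // Erasing drops only the registry's reference; an in-flight request
        // that already holds a shared_ptr keeps the session alive until it
        // finishes, which is what closes the use-after-free window.
        sessions_.erase(handle);
      }

     private:
      std::unordered_map<std::string, std::shared_ptr<WorkerSessionState>> sessions_;
    };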
* Optionally store the status code/message in the response body for RunGraph and RunStep RPCs, to work around the fact that the RPC subsystem truncates long metadata messages. (A. Unique TensorFlower, 2017-12-27)

  PiperOrigin-RevId: 180203356
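  The idea in isolation (the struct and field names here are illustrative, not the actual RunGraphResponse proto): copy the failing status into the response body itself, where long error strings are not subject to the metadata size limit.

    #include <string>

    struct StatusLike {
      int code = 0;  // 0 means OK
      std::string message;
      bool ok() const { return code == 0; }
    };

    struct RunGraphResponseLike {
      int status_code = 0;
      std::string status_error_message;
    };

    void RecordStatusInBody(const StatusLike& s, RunGraphResponseLike* response) {
      if (!s.ok()) {
        response->status_code = s.code;
        // Carried in the response payload, so it is not truncated the way
        // long RPC metadata messages can be.
        response->status_error_message = s.message;
      }
    }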
* Add `ConfigProto.isolate_session_state` option for the distributed runtime. (Derek Murray, 2017-11-28)

  Setting this option to true when creating a session ensures that no stateful resources (variables, queues, iterators, etc.) will be visible to any other session running on the same server, and those resources will be deleted when the session is closed. The default behavior, namely that all `tf.Variable` objects are shared by default and most other resources are shared when their `shared_name` attr is non-empty, is preserved.

  This change augments the semantics of the WorkerService.CreateWorkerSession RPC. Now, if the server_def in the request is empty, it implies that the worker should use its default ClusterSpec. Note that clusters created using ClusterSpec propagation always have isolated session state, and are unaffected by this change.

  PiperOrigin-RevId: 177173545
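  A sketch of opting in from a C++ client, assuming the TensorFlow C++ client headers are available; the target string is a placeholder:

    #include "tensorflow/core/public/session.h"
    #include "tensorflow/core/public/session_options.h"

    tensorflow::Session* NewIsolatedSession() {
      tensorflow::SessionOptions options;
      options.target = "grpc://worker0.example:2222";  // placeholder server address
      // Stateful resources (variables, queues, iterators) created by this
      // session stay invisible to other sessions on the same server and are
      // deleted when the session is closed.
      options.config.set_isolate_session_state(true);
      tensorflow::Session* session = nullptr;
      TF_CHECK_OK(tensorflow::NewSession(options, &session));
      return session;
    }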
* Add `WorkerService.DeleteWorkerSession` method to fix a memory leak. (Derek Murray, 2017-11-15)

  The new method is the counterpart to `WorkerService.CreateWorkerSession`, and is called in all cases where worker sessions have been explicitly created (i.e. when using ClusterSpec propagation).

  PiperOrigin-RevId: 175877407
* OOM error with allocation information. (A. Unique TensorFlower, 2017-11-13)

  PiperOrigin-RevId: 175637128
* Track memory allocation/deallocation history. (A. Unique TensorFlower, 2017-10-05)

  PiperOrigin-RevId: 171239477
* Allowing functions to run across processes using RPCs. Currently this only works for processes running on CPUs. (Rohan Jain, 2017-10-02)

  PiperOrigin-RevId: 170725482
* Merge changes from github. (Jonathan Hseu, 2017-08-25)

  END_PUBLIC

  ---
  Commit b30ce4714 authored by James Qin <jamesqin@google.com>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  Revamp CudnnRNN Saveables

  1. Use a lossy way to save/restore cudnn biases during checkpointing. Cudnn uses 2 biases per gate for all RNNs while tf uses one. To allow cudnn checkpoints to be compatible with both Cudnn and platform-independent impls, previously both the individual biases and the summed biases for each gate were stored. The new way only stores the bias sum for each gate, and splits it half-half when restoring from a cudnn graph. Doing this does not cause problems since RNNs do not use weight-decay to regularize.

  2. Use inheritance instead of branching
     * Split RNNParamsSaveable into 1 base class and 4 subclasses.
     * Extract common routines and only overwrite rnn-type-specific pieces in subclasses.

  PiperOrigin-RevId: 166413989

  ---
  Commit ebc421daf authored by Alan Yee <alyee@ucsd.edu>, committed by Jonathan Hseu <vomjom@vomjom.net>:

  Update documentation for contrib (#12424)

  * Update __init__.py: Remove ## for standardization of api docs
  * Create README.md: Add README to define this directory's purpose
  * Update __init__.py: Markdown styling does not show up well in api docs
  * Update README.md: Add short mention of describing what to deprecate
  * Update README.md: Capitalize title
  * Update README.md: Revert README change
  * Delete README.md

  ---
  Commit fd295394d authored by A. Unique TensorFlower <gardener@tensorflow.org>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  Use latest version of nsync library, which now allows use of cmake on MacOS.

  PiperOrigin-RevId: 166411437

  ---
  Commit 587d728e0 authored by A. Unique TensorFlower <gardener@tensorflow.org>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  [XLA] Refactor reduce-precision-insertion filters, add several more options. In particular, this adds the ability to add reduce-precision operations after fusion nodes based on the contents of those fusion nodes, and the ability to filter operations based on the "op_name" metadata.

  PiperOrigin-RevId: 166408392

  ---
  Commit 3142f8ef5 authored by Ali Yahya <alive@google.com>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  Steps toward making ResourceVariables compatible with Eager. This change forces the value of the reuse flag in variable scopes to be tf.AUTO_REUSE when in Eager mode. This change also adds comprehensive Eager tests for ResourceVariable.

  PiperOrigin-RevId: 166408161

  ---
  Commit b2ce45150 authored by Igor Ganichev <iga@google.com>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  Make Graph::IsValidNode public. It can be reimplemented with existing public APIs, but instead of doing so, making this one public seems better.

  PiperOrigin-RevId: 166407897

  ---
  Commit 0a2f40e92 authored by A. Unique TensorFlower <gardener@tensorflow.org>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  [XLA::CPU] Fix HLO profiling in parallel CPU backend.

  PiperOrigin-RevId: 166400211

  ---
  Commit c4a58e3fd authored by Yao Zhang <yaozhang@google.com>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  Identify frame ids for all nodes in a graph.

  PiperOrigin-RevId: 166397615

  ---
  Commit 989713f26 authored by A. Unique TensorFlower <gardener@tensorflow.org>, committed by TensorFlower Gardener <gardener@tensorflow.org>:

  BEGIN_PUBLIC
  Automated g4 rollback of changelist 166294015

  PiperOrigin-RevId: 166521502
* Add output_partitions support in distributed runtime. (Suharsh Sivakumar, 2017-07-19)

  PiperOrigin-RevId: 162456565
* Performance-related tweaks: Don't copy loop variables; remove ineffective std::move casts. (A. Unique TensorFlower, 2017-06-05)

  PiperOrigin-RevId: 158017670
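  Both patterns, shown in a standalone toy example (the function names are hypothetical):

    #include <string>
    #include <vector>

    std::vector<std::string> MakeNames() { return {"ps", "worker"}; }
    void Consume(const std::string&) {}

    void IterateWithoutCopies() {
      // Iterate by const reference instead of copying each element
      // (was: for (std::string name : MakeNames())).
      for (const std::string& name : MakeNames()) {
        Consume(name);
      }
    }

    std::string BuildName() {
      std::string result = "worker";
      // Returning a local by value already moves (or elides the copy
      // entirely); wrapping it in std::move gains nothing and can inhibit
      // copy elision.
      return result;  // was: return std::move(result);
    }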
* Refactor partial run state handling into partial_run_mgr. (Suharsh Sivakumar, 2017-05-19)

  PiperOrigin-RevId: 156529141
* Implement ClusterSpec Propagation in TF Master (Brennan Saeta, 2017-05-04)

  ClusterSpec propagation is a capability upgrade for TensorFlow that should make it much easier to (1) build distributed TensorFlow clusters, and (2) handle node failures. The ClusterSpec propagation capability allows TensorFlow workers to be booted independently of each other, and with no knowledge about others. The client can then construct a ClusterDef (ClusterSpec), and then send it to the TF master at session creation. The master in turn then propagates the ClusterDef along to all of the workers.

  Change: 155159972
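  A sketch of the client side of that flow, assuming the TensorFlow C++ client headers; all addresses and job names are placeholders:

    #include "tensorflow/core/protobuf/cluster.pb.h"
    #include "tensorflow/core/public/session.h"
    #include "tensorflow/core/public/session_options.h"

    tensorflow::Status NewSessionWithPropagatedCluster(tensorflow::Session** session) {
      tensorflow::SessionOptions options;
      options.target = "grpc://master.example:2222";  // placeholder master address

      // Describe the cluster in the session's ConfigProto; the master receives
      // it at session creation and propagates it to each worker.
      tensorflow::ClusterDef* cluster = options.config.mutable_cluster_def();
      tensorflow::JobDef* job = cluster->add_job();
      job->set_name("worker");
      (*job->mutable_tasks())[0] = "worker0.example:2222";  // placeholder
      (*job->mutable_tasks())[1] = "worker1.example:2222";  // placeholder

      return tensorflow::NewSession(options, session);
    }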
* Add TFDBG support to GrpcSession (Shanqing Cai, 2017-04-29)

  * Along the way, unify the way the debugger works in DirectSession (non-distributed Sessions) and MasterSession (for distributed Sessions).
  * The SummarizeDebugTensorWatches method is invoked in DirectSession::GetOrCreateExecutors() and MasterSession::HashBuildGraphOptions() to generate keys for partition graphs and executors.
  * The DebugStateInterface::PublishDebugMetadata() method is used to send metadata about the debugged Session::Run() call to debug URLs. This happens in DirectSession::Run() and MasterSession::DoRunWithLocalExecution(), respectively.
  * The DebugGraphDecoratorInterface::DecorateGraph() and DebugGraphDecoratorInterface::PublishGraph() methods are used to insert debug ops into the debugged graph and send the modified graph to debug URLs. This happens in DirectSession::GetOrCreateExecutors() and GraphMgr::InitItem(), respectively.

  Change: 154631802
* Change calls to use status.Update. (Suharsh Sivakumar, 2017-03-31)

  Change: 151899404
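  For reference, the Status::Update idiom being adopted, in a small standalone example (the step functions are hypothetical):

    #include "tensorflow/core/lib/core/errors.h"
    #include "tensorflow/core/lib/core/status.h"

    tensorflow::Status StepOne() { return tensorflow::Status::OK(); }
    tensorflow::Status StepTwo() {
      return tensorflow::errors::Internal("simulated failure");
    }

    tensorflow::Status RunAll() {
      tensorflow::Status s;   // starts out OK
      s.Update(StepOne());
      s.Update(StepTwo());    // Update keeps the first non-OK status it sees
      return s;               // reports the StepTwo failure
    }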
* Consolidate worker state behind a session-centric abstraction. (Brennan Saeta, 2017-03-25)

  State in workers is currently splayed across graph_mgr, rendezvous_mgr, and additional components. This has resulted in it being difficult to ensure proper cleanup and shut down of the worker components. In addition to paving the way for a more reliable shut down, this CL also sets up the beginnings of ClusterSpec propagation.

  ClusterSpec propagation is a capability upgrade for TensorFlow that should make it much easier to (1) build distributed TensorFlow clusters, and (2) handle node failures. After the ClusterSpec propagation capability is fully implemented, the TensorFlow workers can be booted independently of each other, and with no knowledge about others. A client can then query a central cluster scheduler or other API to find all of the workers, and then send the ClusterDef (ClusterSpec) to the TF master, which then propagates that along to all of the workers.

  This change is only the first of a sequence to fully implement ClusterSpec propagation in TensorFlow.

  Change: 151229111
* Ensure that partial run doesn't block any threads on the worker compute_pool. (Suharsh Sivakumar, 2017-03-15)

  Change: 150265300
* Fix code that ignores tensorflow::Status. (Peter Hawkins, 2017-02-13)

  Add a new tensorflow::Status::IgnoreError() method to mark call sites where a Status has been intentionally ignored.

  Change: 147402405
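  The new marker in use, in a minimal example (the cleanup function is hypothetical):

    #include "tensorflow/core/lib/core/errors.h"
    #include "tensorflow/core/lib/core/status.h"

    // Hypothetical best-effort cleanup whose failure is tolerable.
    tensorflow::Status TryRemoveTempFiles() {
      return tensorflow::errors::NotFound("nothing to remove");  // simulated outcome
    }

    void Shutdown() {
      // IgnoreError() documents that dropping this Status is intentional
      // rather than an oversight.
      TryRemoveTempFiles().IgnoreError();
    }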
* Provide multiple implementations of RPC responses on the fetch path. (Derek Murray, 2017-01-13)

  This CL includes wrapper classes for the protocol buffer messages `tensorflow::RunStepResponse` and `tensorflow::RunGraphResponse` (to complement the corresponding request message wrappers that were added recently). This change makes the backend code deal with abstract `tensorflow::MutableRunStepResponseWrapper` and `tensorflow::MutableRunGraphResponseWrapper` interfaces and adds three concrete implementations of each interface:

  * A mutable in-memory wrapper, which maintains the tensor data in `tensorflow::Tensor` objects, and provides the most efficient implementation when the client and master (or master and worker) are in the same address space.
  * A mutable, owned protobuf wrapper, which has a similar implementation to today's client code.
  * A mutable, non-owned protobuf wrapper, which has a similar implementation to today's server code (where the protobuf message is owned by the RPC subsystem).

  This is another improvement for issue #6256.

  Change: 144481118
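  A heavily simplified sketch of the wrapper idea (names and methods below are stand-ins, not the real MutableRunStepResponseWrapper API): callers program against one abstract interface while the storage behind it differs.

    #include <cstddef>
    #include <string>
    #include <utility>
    #include <vector>

    struct TensorLike { std::string bytes; };  // stand-in for tensorflow::Tensor

    class MutableFetchResponse {
     public:
      virtual ~MutableFetchResponse() = default;
      virtual void AddFetch(const std::string& name, TensorLike value) = 0;
      virtual std::size_t num_fetches() const = 0;
    };

    // In-memory variant: keeps fetched values as objects, which is cheapest
    // when client and master share an address space and nothing needs to be
    // serialized.
    class InMemoryFetchResponse : public MutableFetchResponse {
     public:
      void AddFetch(const std::string& name, TensorLike value) override {
        fetches_.emplace_back(name, std::move(value));
      }
      std::size_t num_fetches() const override { return fetches_.size(); }

     private:
      std::vector<std::pair<std::string, TensorLike>> fetches_;
    };
    // Proto-backed variants (owned and non-owned) would implement the same
    // interface on top of a protobuf response message instead.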
* Provide multiple implementations of RPC requests on the feed path. (Derek Murray, 2017-01-04)

  This CL includes wrapper classes for the protocol buffer messages `tensorflow::RunStepRequest` and `tensorflow::RunGraphRequest`. Previously the service arguments were always protocol buffer messages, which can entail copying large tensor values into and out of the request message. This change makes the backend code deal with abstract `tensorflow::RunStepRequestWrapper` and `tensorflow::RunGraphRequestWrapper` interfaces and adds three concrete implementations of each interface:

  * A mutable in-memory wrapper, which maintains the tensor data in `tensorflow::Tensor` objects, and provides the most efficient implementation when the client and master are in the same address space.
  * A mutable protobuf wrapper, which has a similar implementation to today's client code.
  * A const wrapper around a const protobuf, which has a similar implementation to today's server code.

  This is another improvement for issue #6256.

  Change: 143620823
* Combine NamedTensorProto and NamedTensor into a single proto. (Derek Murray, 2016-12-22)

  Also use `Tensor::AsProtoTensorContent()` when populating the fetched values from a gRPC worker service, as this is more efficient for larger values. This should improve #6256 slightly.

  Change: 142813084
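  Why AsProtoTensorContent helps, in a short sketch (assumes the TensorFlow core headers): it writes the tensor buffer into the compact tensor_content bytes field instead of the per-element repeated fields produced by AsProtoField.

    #include "tensorflow/core/framework/tensor.h"
    #include "tensorflow/core/framework/tensor.pb.h"

    tensorflow::TensorProto SerializeFetchedValue(const tensorflow::Tensor& fetched) {
      tensorflow::TensorProto proto;
      // Encodes dtype, shape, and the raw buffer as one contiguous byte
      // string, which is cheaper to serialize and parse for large tensors.
      fetched.AsProtoTensorContent(&proto);
      return proto;
    }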
* Optimize the case of a master communicating with an in-process worker. (Derek Murray, 2016-12-22)

  This change modifies the GrpcWorkerCache so that, when a master attempts to communicate with the worker in the same process, it does so by direct method calls on a `WorkerInterface*`, without making a loopback RPC call. This change is another incremental step towards addressing issue #6256.

  There are further improvements possible, and we will continue to investigate them, including:

  * Avoiding the protobuf encoding/decoding for request/response objects where this affects performance. The zero-copy `TensorResponse` class is an example of how we could improve performance here, for `RunGraphRequest` and `RunGraphResponse` objects.
  * Profiling the closure creation/context switch overhead for interactions with the local worker.

  Change: 142793965
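  A simplified sketch of the dispatch decision (this is not the real GrpcWorkerCache; class and method names are illustrative):

    #include <string>
    #include <utility>

    class WorkerIface {
     public:
      virtual ~WorkerIface() = default;
      virtual void RunGraphAsync(/* request, response, done callback */) = 0;
    };

    class WorkerCacheSketch {
     public:
      WorkerCacheSketch(std::string local_target, WorkerIface* local_worker)
          : local_target_(std::move(local_target)), local_worker_(local_worker) {}

      // Returns the worker to use for `target`. For the in-process worker this
      // is the local object itself, so calls become plain virtual method calls
      // rather than loopback RPCs with their serialization and scheduling cost.
      WorkerIface* GetOrCreateWorker(const std::string& target) {
        if (target == local_target_) return local_worker_;
        return CreateRemoteStub(target);  // gRPC-backed stub for remote workers
      }

     private:
      WorkerIface* CreateRemoteStub(const std::string& target);  // not shown

      std::string local_target_;
      WorkerIface* local_worker_;  // not owned
    };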