| Commit message | Author | Age |
|
|
|
| |
PiperOrigin-RevId: 214542049
|
|
|
|
|
|
|
|
| |
Stateless MapDatasets can be parallelized by switching to ParallelMapDataset. We set `num_parallel_calls` to 2 for now, but in the future a special value will be used that results in the optimal value being selected dynamically at runtime.
This patch also exposed a memory leak, which has been fixed.
PiperOrigin-RevId: 213015223
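A minimal sketch of the difference, in plain Python rather than the actual tf.data C++ implementation (function names hypothetical): a stateless per-element function is safe to apply from a small worker pool while preserving output order.

```python
from concurrent.futures import ThreadPoolExecutor

def serial_map(fn, items):
    # Stateless map applied one element at a time (the old MapDataset).
    return [fn(x) for x in items]

def parallel_map(fn, items, num_parallel_calls=2):
    # A stateless map is safe to parallelize: each call is independent,
    # so a fixed pool of workers (2 here, mirroring the commit) can
    # overlap per-element work while pool.map preserves element order.
    with ThreadPoolExecutor(max_workers=num_parallel_calls) as pool:
        return list(pool.map(fn, items))

square = lambda x: x * x
assert serial_map(square, range(5)) == parallel_map(square, range(5))
```

Because the mapped function is stateless, the only observable difference is latency, not results.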
|
|
|
|
|
|
|
|
|
|
|
| |
Rollback of rollback. Fix: make access to collective_graph_key thread-safe.
The original change introduced a collective_graph_key_ integer to DirectSession, but it did not protect accesses to this integer. This change protects access with a mutex.
Automated rollback of commit cb9443831283c2366e3dd91001db6362d6594f66
PiperOrigin-RevId: 211161961
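The fix described above can be sketched in plain Python (hypothetical names; the real DirectSession is C++ and uses a `mutex`): every read and write of the shared integer goes through the same lock.

```python
import threading

class DirectSessionSketch:
    """Toy stand-in for the real C++ DirectSession, illustrating the fix:
    guard all access to collective_graph_key with one mutex."""
    def __init__(self):
        self._mu = threading.Lock()
        self._collective_graph_key = 0  # previously accessed unprotected

    def set_collective_graph_key(self, key):
        with self._mu:  # writes take the lock
            self._collective_graph_key = key

    def collective_graph_key(self):
        with self._mu:  # reads take the same lock, so no torn or stale reads
            return self._collective_graph_key
```

Guarding both the read and the write with the same mutex is what makes the access thread-safe; protecting only one side would not.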
|
|
|
|
| |
PiperOrigin-RevId: 211037202
|
|
|
|
|
|
|
|
|
|
|
| |
Before this CL, for collective_ops to work, the client had to specify a
collective_graph_key in the RunOptions of a session.Run call.
After this change, if a client does not specify a collective_graph_key for a
graph that contains collective ops, a graph key is generated automatically as a
hash of the set of keys of collective instances in the placed graph.
PiperOrigin-RevId: 211024617
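The derivation described above can be sketched as follows (plain Python, hypothetical function name and hash choice; the actual TensorFlow hash differs): hash the set of collective instance keys, sorting first so the result is independent of iteration order.

```python
import hashlib

def derive_graph_key(instance_keys):
    # Hypothetical sketch: derive a graph key from the set of collective
    # instance keys in the placed graph. Sorting makes the key canonical,
    # so every worker derives the same value for the same set of instances.
    canonical = ",".join(str(k) for k in sorted(instance_keys))
    digest = hashlib.sha256(canonical.encode())
    return int.from_bytes(digest.digest()[:8], "big")

# Order-independent: the same set of instances yields the same graph key.
assert derive_graph_key({3, 1, 2}) == derive_graph_key([2, 3, 1])
```

Determinism across workers is the point: the key replaces a value the client previously had to agree on out of band via RunOptions.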
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GPU memory allocation can be done in one of two modes: efficient (but
complex and therefore somewhat risky) or conservative (simpler, but less
efficient). The main difference is that 'efficient' allocation allows
the same memory area to be allocated to multiple independent uses
simultaneously, when it should be the case that those uses will in
fact be serial and thus temporally disjoint, while 'conservative'
allocation will always obey the invariant that one piece of memory is
allocated to at most one use at any point in time.
If GPUDevice::RequiresRecordingAccessedTensors() returns false, then
the TF runtime uses efficient memory allocation for GPU ops. That is, GPU
ops are nominally synchronous and their tensor Refs are deleted
immediately after the op returns, although in reality the corresponding GPU
kernel is only guaranteed to have been enqueued on the compute stream
and may not yet have begun execution.
If RequiresRecordingAccessedTensors() returns true, then conservative
memory allocation is used, i.e. Refs on the tensors accessed by a GPU op
are held until the corresponding kernel is guaranteed to have completed
execution and no part of the op will touch them again.
Efficient GPU memory allocation should be safe when the following criteria
are all met:
1. All GPU kernels are executed serially on a single compute stream.
2. All GPU kernel outputs and temp buffers are allocated by
the GPU Op in the executor thread in which it is originally called.
3. Any read of a GPU tensor computed by a GPU kernel that is not
by another kernel on that same GPU first synchronizes on
the compute stream that produced it.
4. Any read by a GPU kernel of a value that was not produced by another
GPU kernel first synchronizes on the entity that produced it,
e.g. a copy stream.
5. All direct allocations of GPU memory that are not for kernel outputs
or temp buffers are conservative in duration.
6. Any use of directly allocated GPU memory that is not part of a kernel
execution first synchronizes on the compute stream to ensure that
any prior granted uses of the same region have expired before this new use.
These conditions together should be sufficient for safety, and
correspond to established practice, though it may be possible to
contrive other sets of rules that are also sufficient.
Collective Ops for GPUs are unusual in that they are async (as TF
Ops) and they can directly allocate GPU memory in CPU threads that are
asynchronous to the launching executor thread. This CL corrects a
couple of subtle misuse errors related to conditions 2 and 6.
PiperOrigin-RevId: 210841522
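The two ref-lifetime policies above can be contrasted with a toy sketch (plain Python, hypothetical names; the real runtime works on C++ tensor references and streams):

```python
class Tensor:
    """Toy ref-counted tensor, for illustration only."""
    def __init__(self):
        self.refs = 0
    def ref(self):
        self.refs += 1
    def unref(self):
        self.refs -= 1

def run_op_efficient(inputs, enqueue_kernel):
    # Efficient mode: refs are dropped as soon as the op returns, even
    # though the kernel has only been *enqueued* on the compute stream.
    for t in inputs:
        t.ref()
    enqueue_kernel()
    for t in inputs:
        t.unref()  # memory may now be granted to an independent later use

def run_op_conservative(inputs, enqueue_kernel, on_kernel_done):
    # Conservative mode: refs are held until the kernel is guaranteed
    # to have completed execution.
    for t in inputs:
        t.ref()
    enqueue_kernel()
    def drop_refs():
        for t in inputs:
            t.unref()
    on_kernel_done(drop_refs)  # refs released only at completion time
```

In the efficient sketch the refs reach zero while the kernel may still be pending, which is exactly why conditions 1-6 above are needed to make that mode safe.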
|
|
|
|
| |
PiperOrigin-RevId: 210596417
|
|
|
|
| |
PiperOrigin-RevId: 209685137
|
|
|
|
| |
PiperOrigin-RevId: 209679086
|
|
|
|
|
|
| |
job name.
PiperOrigin-RevId: 209597829
|
|\
| |
| |
| | |
PiperOrigin-RevId: 208266944
|
| |
| |
| |
| | |
PiperOrigin-RevId: 208254124
|
| |
| |
| |
| | |
PiperOrigin-RevId: 207971672
|
| |
| |
| |
| | |
PiperOrigin-RevId: 207394440
|
| |
| |
| |
| |
| |
| | |
from the worker_env_ value.
PiperOrigin-RevId: 205987011
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
At times, a server cannot open a reverse connection to the client. Such a connection is
required when using the _Send/_Recv ops and the client needs to send a tensor
to the server (tensors are pulled). Instead, this change adds a way to push
tensors directly from the client.
Currently, pushing tensors always happens in sync mode.
PiperOrigin-RevId: 205888825
|
| |
| |
| |
| | |
PiperOrigin-RevId: 205756865
|
| |
| |
| |
| | |
PiperOrigin-RevId: 204981602
|
| |
| |
| |
| | |
PiperOrigin-RevId: 204544587
|
| |
| |
| |
| |
| |
| | |
* debug_gateway and the related node_outputs_callback are not used and hence are removed in this CL.
PiperOrigin-RevId: 204519574
|
| |
| |
| |
| |
| |
| | |
This causes DirectSession to report a better error message if there is an error initializing GPUs.
PiperOrigin-RevId: 204498143
|
| |
| |
| |
| | |
PiperOrigin-RevId: 203872748
|
| |
| |
| |
| |
| |
| | |
I believe this will be required if (when?) the TPUClusterResolver returns IPv6 addresses.
PiperOrigin-RevId: 203842540
|
| |
| |
| |
| | |
PiperOrigin-RevId: 203518000
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
distributed_runtime/RpcCollectiveExecutorMgr.
In a distributed environment WorkerInterface is going to call this
method at the group leader when fielding a GetStepSequence request
from one of the other workers.
PiperOrigin-RevId: 203196543
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Modifies GraphModeFunction to emit PartitionedCall ops instead of Call ops
so that the created functions can execute across devices. This should strictly
increase the set of functions that tfe.defun can faithfully execute.
Prior to this change, functions executed through tfe.defun would ignore
device annotations and only run on a single device. It is not yet possible to execute
a function across multiple processes.
Specifically, this CL:
(1) Adds a stateful version of PartitionedCall,
(2) Modifies `defun` to emit PartitionedCall or StatefulPartitionedCall by default,
(3) Makes `tf.gradients` aware of the existence of `(Stateful)PartitionedCall`,
(4) Fixes bugs in PartitionedCallOp related to the placement of
resource-touching ops / which args and retvals are always on host memory, and
also removes the requirement for args/retvals to be passed through the host.
PiperOrigin-RevId: 203164388
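The core idea can be sketched in plain Python with hypothetical names (the real PartitionedCallOp partitions a placed graph, not a list of dicts): group a function's ops by device annotation so each device executes its own partition instead of everything landing on one device.

```python
def partition_by_device(ops):
    # Hypothetical sketch: split a function body into per-device partitions
    # keyed by each op's device annotation, preserving op order within a
    # partition. A partitioned call then runs one partition per device.
    partitions = {}
    for op in ops:
        partitions.setdefault(op["device"], []).append(op)
    return partitions

ops = [
    {"name": "matmul", "device": "GPU:0"},
    {"name": "embed_lookup", "device": "CPU:0"},
    {"name": "relu", "device": "GPU:0"},
]
parts = partition_by_device(ops)
assert sorted(parts) == ["CPU:0", "GPU:0"]
assert [op["name"] for op in parts["GPU:0"]] == ["matmul", "relu"]
```

The hard parts the commit mentions (resource-touching ops, host-memory args/retvals, cross-partition edges) are precisely what this toy grouping omits.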
|
| | |
|
| |\
| |/
|/| |
|
| |
| |
| |
| | |
PiperOrigin-RevId: 202370201
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
more than one device-to-device copy stream per GPU device.
This is an experimental feature that will have no effect unless
copy operations explicitly request a stream other than 0, which
currently does not occur anywhere in a standard build.
Eventually it may be of benefit in the presence of multiple
bi-directional concurrent data copies.
PiperOrigin-RevId: 202354513
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
During initialization of local collective params, we may issue RPCs to other
workers in order to obtain device localities. Currently, we hold a mutex
across these RPCs, but we do not ensure that the thread that unlocks the mutex
is the same as the one that locked it.
This change releases the mutex (InstanceRec::out_mu) before calling
GetDeviceLocalitiesAsync. Before releasing out_mu, it marks the mutex
unavailable. Any thread that wishes to acquire out_mu must wait on a condition
variable if the mutex is unavailable. The callback for
GetDeviceLocalitiesAsync marks the mutex as available again and notifies the
condition variable.
PiperOrigin-RevId: 202346357
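The locking protocol described above can be sketched in plain Python (hypothetical names; the real code is C++ with mutex and condition_variable): the mutex becomes a logical "available" flag guarded by a condition variable, so the thread that releases it need not be the thread that acquired it.

```python
import threading

class InstanceRecSketch:
    """Toy sketch of the out_mu protocol: mark the mutex unavailable
    before the async RPC, then let the RPC's callback (possibly on a
    different thread) mark it available again and notify waiters."""
    def __init__(self):
        self._cv = threading.Condition()
        self._available = True

    def acquire_out_mu(self):
        with self._cv:
            while not self._available:
                self._cv.wait()      # wait while another thread logically holds it
            self._available = False  # logically held from here on

    def release_out_mu(self):
        with self._cv:
            self._available = True   # e.g. called from the RPC callback
            self._cv.notify_all()    # wake any thread blocked in acquire_out_mu
```

Unlike a raw mutex, this hand-rolled flag is legal to "unlock" from a different thread, which is exactly the property the async GetDeviceLocalitiesAsync callback needs.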
|
| |
| |
| |
| |
| |
| | |
when using Session::RunCallable().
PiperOrigin-RevId: 202234757
|
| |
| |
| |
| |
| |
| |
| |
| | |
Since we respond with the shape, all RPCs will happen synchronously (note
that we may still hide the Python overhead, since the op is still scheduled for
execution via the eager executor).
PiperOrigin-RevId: 202207324
|
| |
| |
| |
| | |
PiperOrigin-RevId: 202585094
|
| |
| |
| |
| | |
PiperOrigin-RevId: 202544091
|
|
| |\
| |/
|/| |
|
| |
| |
| |
| | |
PiperOrigin-RevId: 201586130
|
| |\
| |/
|/| |
|
| |
| |
| |
| | |
PiperOrigin-RevId: 201422113
|
| |\
| |/
|/| |
|
| |
| |
| |
| |
| |
| | |
the grpc_tensorflow_server.
PiperOrigin-RevId: 201198350
|
| |
| |
| |
| | |
PiperOrigin-RevId: 201110240
|
| |
| |
| |
| | |
PiperOrigin-RevId: 201033171
|