path: root/tensorflow/core
Commit message (Author, Age)
* Automated rollback of change 126349886 (Zongheng Yang, 2016-06-30)
  Change: 126374056
* Add a unit test to test DNN weights/biases lookup with correct names from saved checkpoints. (A. Unique TensorFlower, 2016-06-30)
  Change: 126367935
* Automated rollback of change 126348349 (Zongheng Yang, 2016-06-30)
  Change: 126366720
* Move POSIX-dependent tracing function from platform/default -> platform/posix. (Derek Murray, 2016-06-30)
  Change: 126364522
* Fix sparse_softmax_cross_entropy_with_logits for empty tensor (Geoffrey Irving, 2016-06-30)
  If the batch size is zero, we need to avoid calling into Eigen because Eigen will explode. Zero classes is an error.
  Change: 126359444
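The guard described in the entry above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `sparse_softmax_cross_entropy`, its list-based arguments, and the explicit `num_classes` parameter are hypothetical, and the real kernel is C++ code built on Eigen.

```python
import math


def sparse_softmax_cross_entropy(logits, labels, num_classes):
    """Per-example loss. Hypothetical pure-Python stand-in, not the TF kernel."""
    if num_classes == 0:
        # Zero classes is an error: no label could ever be valid.
        raise ValueError("must have at least one class")
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same batch size")
    if not logits:
        # Empty batch: return early instead of entering the numeric core.
        return []
    losses = []
    for row, label in zip(logits, labels):
        if len(row) != num_classes:
            raise ValueError("each logits row needs num_classes entries")
        if not 0 <= label < num_classes:
            raise ValueError("label out of range")
        # Numerically stable log-sum-exp of the row.
        top = max(row)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in row))
        losses.append(log_sum_exp - row[label])
    return losses
```

The point of the sketch is the ordering of the checks: the zero-class error is raised even for an empty batch, and the empty batch is answered before any arithmetic runs.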
* Add C++ shape inference for Pack, Unpack, and Const. (A. Unique TensorFlower, 2016-06-30)
  Add GetAttr to shape_inference::InferenceContext. Allow setting NodeDef in shape_inference_testutil INFER calls (with new INFER*_WITH_DEF macro). Fix a bug that caused a crash when an INFER..ERROR macro called a shape inference function that did not return an error.
  Change: 126350221
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-30)
  Change: 126349886
* Support Sparse-Sparse cwise ops; use for tf.sparse_{minimum,maximum}(). (Zongheng Yang, 2016-06-30)
  This change adds the CPU kernel and Python ifaces. For now, assumes both operands have the same shapes.
  Change: 126348349
* Fix initialization issues with Variables whose shape contains a zero. (A. Unique TensorFlower, 2016-06-30)
  Fixes #2099. Tries to give Variables the same behavior as non-Variable tensors in this respect. Useful for not having to special case e.g. coefficients of a feature vector which may sometimes not have any features.
  Change: 126347791
* Raise an Unimplemented error when using MasterSession with place_pruned_graphs. (Derek Murray, 2016-06-30)
  This change has the unfortunate side-effect of preventing the use of tf.InteractiveSession with a gRPC session. This is intended as a temporary measure, while we fix the implementation of SimpleGraphExecutionState.
  Change: 126344674
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-30)
  Change: 126344587
* Improved the gradients for tanh and sigmoid. (Benoit Steiner, 2016-06-30)
  This improves the speed of the ptb word model from 6800 to 7800 words per second.
  Change: 126342788
* Fix stack-use-after-scope detected by asan in gtl test (A. Unique TensorFlower, 2016-06-30)
  inserted_count is used by RefCounted which is used by RefCountedVec. So inserted_count must outlive RefCountedVec.
  Change: 126342639
* Clip the padding for negative values. (Xiaoqiang Zheng, 2016-06-30)
  Change: 126338283
* Merge changes from github. (A. Unique TensorFlower, 2016-06-30)
  Change: 126335170
* Add C++ shape inference function for broadcasting binary ops. (A. Unique TensorFlower, 2016-06-30)
  This is the same logic as python, with an addition in the case where 1 value is unknown and the other is unknown - in this case, we propagate the unknown input dim instead of a new unknown input dim (this case did not apply in python where None was used for the unknown input).
  Change: 126308395
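A rough Python sketch of broadcasting shape inference with unknown dimensions, using `None` for an unknown size as the entry above says the Python code did. The helper names `broadcast_dim` and `broadcast_shape` are hypothetical, and the handling of unknown dimensions is an assumption based on ordinary broadcasting rules, not a copy of the C++ function.

```python
from itertools import zip_longest


def broadcast_dim(x, y):
    """Broadcast two dimension sizes; None stands for an unknown size."""
    if x is None or y is None:
        known = y if x is None else x
        if known is None:
            return None   # both sides unknown
        if known > 1:
            return known  # the unknown side must be 1 or equal to `known`
        return None       # otherwise the result stays unknown
    if x == y or y == 1:
        return x
    if x == 1:
        return y
    raise ValueError("incompatible dimensions: %r vs %r" % (x, y))


def broadcast_shape(a, b):
    """Align shapes from the trailing dimension, padding the shorter with 1."""
    pairs = zip_longest(reversed(a), reversed(b), fillvalue=1)
    return [broadcast_dim(x, y) for x, y in pairs][::-1]
```

With `None` there is no way to tell one unknown dimension from another, which is why the distinction drawn in the commit message (propagating the input's unknown dimension rather than creating a fresh one) cannot be expressed in this sketch.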
* Fix OSS compilation failure in save_op_test. (Zongheng Yang, 2016-06-29)
  Change: 126263834
* Adds a microbenchmark for Save op. (Zongheng Yang, 2016-06-29)
  Change: 126255634
* Correct bug in crop_and_resize which caused some tests to fail. (A. Unique TensorFlower, 2016-06-29)
  Change: 126246458
* Rollback linkstatic removal, unforeseen interaction on some archs. (A. Unique TensorFlower, 2016-06-29)
  Change: 126219121
* Remove linkstatic=1 from places where it is unneeded in the newer Bazel version. (A. Unique TensorFlower, 2016-06-29)
  Update the minimum required Bazel version to 0.3.0 which includes the bugfix.
  Change: 126189429
* Change quantized concat to use the same core function as the concat kernel in core tensorflow. (A. Unique TensorFlower, 2016-06-28)
  This is done by moving the core of concat_lib_cpu.cc into a new .h, and making it templated on a struct that defines the function to copy a range of elements.
  Change: 126147862
* Python tensorflow.Example parser configuration extractor (Ben Lee, 2016-06-28)
  - Proto definition for configuration
  - Utility for converting from proto
  - Visibility change
  Change: 126145305
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-28)
  Change: 126125286
* Added the PriorityQueue and Barrier TF objects. (Eugene Brevdo, 2016-06-28)
  Change: 126119692
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-28)
  Change: 126091563
* Allow RefSwitch to have uninitialized variables as inputs. (Manjunath Kudlur, 2016-06-28)
  Change: 126086117
* Move explicit instantiations of SetZeroFunctor template defined in fill_functor.h to its own compilation unit in fill_functor.cc. (A. Unique TensorFlower, 2016-06-28)
  Change: 126081539
* Prints out a vlog when done with warmups inside RunWithArgs(). (Zongheng Yang, 2016-06-27)
  Useful to identify execution stages when looking at logs.
  Change: 126022244
* Added the ability to restore checkpoints containing 16 bit floats. (Benoit Steiner, 2016-06-27)
  Change: 126013328
* Adds a "currentThreadIndex" method to Eigen's ThreadPoolDevice. (A. Unique TensorFlower, 2016-06-27)
  Use it to handle per-thread buffer allocation for the tileable executor without resorting to thread_local, which is not fully supported on Android.
  Change: 126009029
* A series of changes to significantly reduce the number of allocations done by models distributed across many devices. (A. Unique TensorFlower, 2016-06-27)
  A small microbenchmark model that runs two banks (A and B) of 30 nodes with a 30x30 full shuffle between them, where each of the nodes in A and in B run with one node on each of the 30 devices (so 30*29+30+30, or ~930 separate RPCs) was showing ~111,000 allocations per iteration of the graph. With the changes here, this is now down to ~64,300 allocations per iteration.
  Changes include:
  o DeviceContext::CopyDeviceTensorToCPU and related helper routines: use StringPiece instead of const string& for the tensor name (avoids creating a string in some cases where the caller only has a StringPiece available).
  o Change some Rendezvous and BaseRemoteRendezvous interfaces to take a 'const Rendezvous::ParsedKey& key', rather than 'const string& key'. In many cases, the callers were already having to parse the key into a ParsedKey, and so we were doing the parsing multiple times at different levels as we processed receiving or sending of a tensor. This reduces the number of times that we parse a key as it flows from a Send node through to a Recv node on another worker.
  o Changed Rendezvous::ParsedKey so that it makes a copy of the underlying full key, and then uses StringPiece objects to point into this copy for the src_device, dst_device, and edge_name pieces. This turns 3 string allocations into 1 per Rendezvous::ParseKey call.
  o Added new StringPiece Rendezvous::ParsedKey::FullKey() accessor to return a StringPiece for the underlying full key, and used that in a few places (mostly logging) where that is useful.
  o In many places, used std::move(function_variable) when assigning to an instance variable. This eliminates a very large number of excess std::function allocations/initializations (~56000 of the baseline allocations were related to std::function setup or cloning, and this is now down to ~11000 after this cl).
  o In the RPC-based remote workers (StubbyRemoteWorker and GrpcRemoteWorker), changed the code path in RecvTensorAsync to avoid creation of a std::function with 6 arguments unless necessary. There are three cases now handled separately:
    (a) We're not logging, and we didn't make a copy of the request that we need to free: just use the passed in 'StatusCallback done' object directly, without creating a wrapper std::function object at all.
    (b) We're not logging, but we made a copy of the request that we need to free: we create a simple wrapper std::function that invokes the passed in 'done' callback, and then frees the req_copy request copy object.
    (c) We're logging: we create the std::function object with all the necessary state to log when the recv has finished.
  o Changed DeviceMgr::LookupDevice to take a StringPiece, rather than a const string&, and changed the hash table to use StringPiece keys. This allows clients that just have a StringPiece device name in their hand to avoid a string creation to look up the Device* object.
  o Changed ExecutorState to use a specialized TaggedNodeReadyQueue that internally uses a gtl::InlinedVector<TaggedNode, 16>, rather than using a std::deque<TaggedNode> for keeping track of nodes ready to execute. This is faster because it avoids allocations entirely if the ready node queue doesn't get bigger than 16, and inlined vectors are generally faster than std::deque, at a minor risk of using more memory if this queue grows to very large numbers of ready nodes (mostly imaginable only in pathological graphs).
  o In ExecutorState::Process, allocated a single ExecutorState::AsyncState object to keep track of all the state we need to preserve for an asynchronously executed node, rather than keeping this state implicitly via a very large number of arguments to a lambda function.
  o Added new atomic std::atomic<bool> status_is_ok_ in BaseRemoteRendezvous. This allows us to avoid acquiring the lock when we just want to check if the status is non-OK in BaseRemoteRendezvous::Send and BaseRemoteRendezvous::ValidateDevices.
  o In GraphMgr::RunAllDone, changed assignment of args.runner to avoid one extra level of std::function indirection (binding the function directly to the ThreadPool::Schedule routine, rather than creating an intermediate lambda function that invokes this inside the body of the lambda).
  o Added freelist of RpcRecvTensorCall objects in third_party/tensorflow/core/distributed_runtime/rpc/rpc_rendezvous_mgr.cc
  o Changed third_party/tensorflow/core/framework/rendezvous.cc to keep the hashtable of Item* objects keyed by uint64 (hash of the tensor name), rather than the full-string tensor name. Collisions in the 64-bit hash space should basically never happen.
  o Sped up DeviceNameUtils::ParseFullName by optimizing for the common ordering of parts of /job, /replica, /task, /device. The parsing code was general enough to handle any order, but did so by comparing the prefixes 4, 3, 2, and 1 times, respectively, rather than 1, 1, 1, and 1 times.
  o Sped up DeviceNameUtils::SplitDeviceName to avoid extra string copies.
  Change: 125991891
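The three RecvTensorAsync cases in the entry above amount to building a wrapper callback only when there is extra work to do. A minimal Python sketch of that pattern follows. Everything in it is hypothetical (the name `make_recv_done`, the `release()` method on the request copy, the `logger` callable, and the order of logging versus calling `done`), and the actual change is C++ using std::function.

```python
def make_recv_done(done, req_copy=None, logger=None):
    """Return the callback to run when a receive finishes.

    done:     callable taking a status (the caller's callback).
    req_copy: optional object with a release() method (hypothetical).
    logger:   optional callable taking a status (hypothetical).
    """
    if logger is None and req_copy is None:
        # (a) Nothing extra to do: hand back the caller's callback as-is,
        # so no wrapper object is created.
        return done
    if logger is None:
        # (b) Only cleanup is needed: a small wrapper that frees the copy.
        def done_then_free(status):
            done(status)
            req_copy.release()
        return done_then_free

    # (c) Logging: the wrapper carries the state needed to log completion.
    def done_then_log(status):
        logger(status)
        done(status)
        if req_copy is not None:
            req_copy.release()
    return done_then_log
```

In case (a) the returned object is the very callback that was passed in, which is the allocation the commit message says it avoids.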
* Move cupti_wrapper target to platform/default/gpu (A. Unique TensorFlower, 2016-06-27)
  Change: 125975221
* Updated the draw_bounding_box operation to work properly with fp16 (Benoit Steiner, 2016-06-27)
  Change: 125964943
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-27)
  Change: 125963264
* Clarify the softmax op documentation. (Derek Murray, 2016-06-27)
  Change: 125961539
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-26)
  Change: 125901975
* Add a variant of mutable hash table that supports tensors as values. (A. Unique TensorFlower, 2016-06-26)
  Add support for exporting the contents of a table.
  Change: 125901929
* TF debugger core callback and DebugGateway (Shanqing Cai, 2016-06-25)
  This is the first of a series of CLs aimed at implementing the TF debugger (tfdb). This C++ CL adds the node-outputs callback to ProcessOutputs() in ExecutorImpl and provides access to it via the DebugGateway class. This makes it possible to observe intermediate node outputs during a DirectSession.Run() call.
  Change: 125882979
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-24)
  Change: 125838604
* Fix a bug in the handling of Send/Recv of ref edge. (Yuan Yu, 2016-06-24)
  The bug shows up in the corner case where an op has multiple inputs, one of the inputs is of ref type from a different device, and one other input is also remote but cached for some other reason.
  Change: 125837438
* Add support for DequeueUpTo in FIFOQueue and PaddingFIFOQueue (Eugene Brevdo, 2016-06-24)
  Change: 125837171
* Merge changes from github. (A. Unique TensorFlower, 2016-06-24)
  Change: 125835079
* Added more ops to testlib. (Jianmin Chen, 2016-06-24)
  Change: 125829994
* Fix linker errors by adding a dependence on constant_op to targets using fill_functor. (A. Unique TensorFlower, 2016-06-24)
  This is needed because the templates in fill_functor.h are instantiated in constant_op.cc.
  Change: 125816082
* Added tf.container(container_name) context manager. (Sherry Moore, 2016-06-24)
  Added Python interfaces to reset resource containers. pywrap_tensorflow.TF_Reset(target, [containers]) will release resources in all listed containers, while pywrap_tensorflow.TF_Reset(target) will release resources in all the containers. In particular, Variables cached on the devices will be cleared by this call.
  Change: 125790643
* Update ops-related pbtxt files. (A. Unique TensorFlower, 2016-06-24)
  Change: 125790131
* Implement gradient for StridedSlice op. (Andrew Selle, 2016-06-24)
  StridedSliceGrad op implements the gradient of StridedSlice. Also implement python benchmark for StridedSlice and simple Slice. Fix bugs in special case optimizations in StridedSlice. (Toward resolving bug #206)
  Change: 125789921
* Change to const Example pointers (Ben Lee, 2016-06-24)
  - Prevent unnecessary tensorflow.Example copies when this method is used in serving.
  Change: 125788885
* Minor improvements to GcsFileSystem. (Alexey Surkov, 2016-06-24)
  - adds support of paths pointing to the root of a bucket, e.g. gs://bucket
  - adds pagination support for GetChildren
  - makes a more optimal HTTP request in GetChildren
  Change: 125785863
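The pagination mentioned for GetChildren can be illustrated with a generic token-following loop in Python. This is a sketch under assumptions, not the GcsFileSystem code: `list_children` and `fetch_page` are hypothetical names, and the response fields `items` and `nextPageToken` are modeled on the Cloud Storage JSON API object listing and should be treated as assumptions here.

```python
def list_children(fetch_page):
    """Collect child names across all result pages.

    fetch_page(token) is a hypothetical callable that performs one listing
    request (token is None for the first page) and returns a dict with an
    optional 'items' list and an optional 'nextPageToken' string.
    """
    names = []
    token = None
    while True:
        page = fetch_page(token)
        names.extend(item["name"] for item in page.get("items", []))
        token = page.get("nextPageToken")
        if not token:
            # No continuation token: this was the last page.
            return names
```

Passing the request function in as an argument keeps the loop independent of any HTTP client, so it can be exercised with canned pages.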