path: root/tensorflow/core/common_runtime/simple_placer.h
Commit message | Author | Age
* SimpleGraphExecutionState -> GraphExecutionState | Suharsh Sivakumar | 2017-09-07
  SimplePlacer -> Placer
  And clean up a couple unneeded headers.
  PiperOrigin-RevId: 167955883
* This change significantly reduces time and resources used to load large TensorFlow graphs. | A. Unique TensorFlower | 2017-06-01
  For a real-world large graph (13k nodes, 20k edges), this change:
  * reduces all heap allocations by 19%
  * reduces retained (final) heap allocations by 2.2%
  * reduces CPU time by 11.2%

  In most TF graphs, the set of unique values set to Node::assigned_device_name() is quite small. This change adds an interning table to the Graph object, which contains all of the unique values used for Node::set_assigned_device_name(), as well as a look-up table. This is the main source of the reduction in retained heap memory; nearly all nodes are assigned to just one or two unique devices.

  This change removes the "string assigned_device_name_" field from the Node class, and replaces it with "int assigned_device_name_index_". However, because you need both the index and the name table to get the actual value, the Node::assigned_device_name() accessor needs access to the parent Graph. This requires adding a "Graph* graph_" field to the Node class.

  In the future, if all users of this property are converted to use Graph::assigned_device_name(Node*), then the Node::graph_ field can be deleted, and the space reclaimed. However, doing so is out of the scope of this CL, and even with this new pointer field, the Node class is smaller than it was before, so this is still a net win.

  The placement algorithm in simple_placer.cc is one of the main accessors of the Node::assigned_device_name property. This CL contains significant changes to simple_placer.cc, which directly take advantage of the fact that the property is an index into a name table, rather than treating it simply as a string. Many temporary allocations are also removed, which is the main source of the reduction in total heap allocations.

  This CL also contains a few changes that remove short-lived allocations in unrelated code, such as the changes in op.cc/h, costmodel.cc, etc. It is extremely easy in C++ to accidentally allocate memory, especially when implicit conversions and copy constructors allocate memory.

  All of the changes in this CL were motivated by empirical measurement, using CPU profiling and heap profiling.
  PiperOrigin-RevId: 157762909
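The interning scheme described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual tensorflow::Graph code: the names DeviceNameTable, Intern, Name, and SketchNode are hypothetical, and reserving index 0 for the empty (unassigned) name is an assumption made for the sketch.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical interning table (not the real Graph API): every distinct
// device name is stored once, and nodes refer to it by integer index.
class DeviceNameTable {
 public:
  // Assumption for this sketch: index 0 always means "no device assigned".
  DeviceNameTable() {
    names_.push_back("");
    index_.emplace("", 0);
  }

  // Returns the index for `name`, adding it to the table on first use.
  int Intern(const std::string& name) {
    auto it = index_.find(name);
    if (it != index_.end()) return it->second;
    const int id = static_cast<int>(names_.size());
    names_.push_back(name);
    index_.emplace(name, id);
    return id;
  }

  // Looks up the string for a previously interned index.
  const std::string& Name(int id) const { return names_.at(id); }

  std::size_t size() const { return names_.size(); }

 private:
  std::vector<std::string> names_;               // index -> name
  std::unordered_map<std::string, int> index_;   // name -> index
};

// Hypothetical node: holds an int index instead of its own string copy.
struct SketchNode {
  int assigned_device_name_index = 0;
};
```

With this layout, thousands of nodes assigned to the same device share one stored string, and comparing two nodes' devices becomes an integer comparison. Resolving the name still needs the table, which mirrors why the commit adds a pointer back to the parent Graph.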
* SimplePlacer: apply heuristics only if the candidate device is valid | Vijay Vasudevan | 2017-01-09
  (is in the list of passed in devices).

  Before this change, if the candidate_device_name was /cpu:0 but the list of valid devices was /cpu:1 (because that's what the user specified), we would still apply the heuristics to assign the device to candidate_device_name, since the device type is the same. However, we never want to ignore what the user has specified, so we properly check that the device name matches one of the devices in the list of valid devices for that node (as determined by the user or the placer constraints). This adds tests to verify this behavior.

  Also added a note about the check for assigned_device_name in one of the loops. The check is not strictly necessary: the act of adding a node to the colocation_group structure at the beginning of the function reads the existing assigned_device_name and populates the list of possible devices, so GetDevicesByNode() will always return a single device in this case. However, this check avoids some extra computation that isn't needed, so it's still valid to have.

  I now do add an AssignAndLog statement to make sure stateful placements are logged (before this change, they weren't).

  cc @DavidNorman
  Change: 143978397
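The validity check described in the commit above can be illustrated with a small sketch. The function names IsValidCandidate and ChooseDevice are hypothetical and device names are plain strings here; falling back to the first valid device is an assumption borrowed from the default described elsewhere in this log, not a statement about the real code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: the heuristic's candidate may only be used if its
// exact name appears among the devices that are valid for this node.
inline bool IsValidCandidate(const std::string& candidate_device_name,
                             const std::vector<std::string>& valid_devices) {
  return std::find(valid_devices.begin(), valid_devices.end(),
                   candidate_device_name) != valid_devices.end();
}

// Hypothetical selection step: apply the heuristic only when the candidate
// is valid; otherwise fall back to the first valid device (assumed default).
// Returns an empty string if there are no valid devices at all.
inline std::string ChooseDevice(const std::string& candidate_device_name,
                                const std::vector<std::string>& valid_devices) {
  if (IsValidCandidate(candidate_device_name, valid_devices)) {
    return candidate_device_name;
  }
  return valid_devices.empty() ? std::string() : valid_devices.front();
}
```

In the scenario from the commit message, a candidate of /cpu:0 with a valid list containing only /cpu:1 yields /cpu:1, so the user's request is not overridden just because the device type matches.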
* Merge changes from github. | Jonathan Hseu | 2016-09-29
  Change: 134721831
* Update copyright for 3p/tf/core. | A. Unique TensorFlower | 2016-06-02
  Change: 123900938
* TensorFlow: implement placement heuristic that takes generator nodes | Vijay Vasudevan | 2016-06-02
  (nodes with one non-ref output and one consumer), and places it preferentially with its consumer.

  For example:

        assign
        /    \
      var   input

  In the above graph, assign is bound to the device of 'var' due to the reference edge. This heuristic binds 'input' to the same device as the assign, because it has only one consumer. This addresses the general problem of colocating initializers with their variables, and similar other cases.

  There are very few reasons to want to place the 'input' on a node other than its consumer (there are some contrived cases, but that's why this is a heuristic).

  This CL adds a test case for this small example above, illustrative of the general problem.

  An extension of this CL would be to do the same thing not just for single output / single consumer nodes, but whenever all out edges of a node connect to the same 'colocation group'.
  Change: 123896863
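The heuristic from the commit above can be sketched on a toy graph. ToyNode, IsGeneratorLike, and PlaceWithConsumer are hypothetical names, and the graph is a plain vector of structs rather than TensorFlow's Graph class; the predicate follows the wording of the commit message (one non-ref output, one consumer) and nothing more.

```cpp
#include <string>
#include <vector>

// Hypothetical toy node; consumers are indices into the graph vector.
struct ToyNode {
  std::string name;
  std::string device;          // empty means "not yet placed"
  int num_outputs = 1;
  bool output_is_ref = false;  // true for reference outputs, e.g. a variable
  std::vector<int> consumers;
};

// A node qualifies if it has one non-ref output and exactly one consumer.
inline bool IsGeneratorLike(const ToyNode& n) {
  return n.num_outputs == 1 && !n.output_is_ref && n.consumers.size() == 1;
}

// Places each unplaced qualifying node on the device of its only consumer,
// provided that consumer already has a device.
inline void PlaceWithConsumer(std::vector<ToyNode>& graph) {
  for (ToyNode& n : graph) {
    if (!n.device.empty() || !IsGeneratorLike(n)) continue;
    const ToyNode& consumer = graph[n.consumers[0]];
    if (!consumer.device.empty()) n.device = consumer.device;
  }
}
```

For the assign / var / input example, 'input' has no device of its own and feeds only 'assign', so it inherits the device that 'assign' received from 'var'.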
* TensorFlow: Move assignment / choice of device in SimplePlacer out of | Vijay Vasudevan | 2016-05-02
  AssignDevice and into the main loop, for a future refactor where we inject the choice of algorithm to use when selecting a device for a node.

  Currently this continues to use just the first device in the list, but we would like to be able to play around with algorithms that choose alternative strategies, perhaps based on other heuristics and runtime information.

  SimplePlacer remains the code that performs the precondition filters (hard-device assignment and validation), so that other placement algorithms don't have to worry about enforcing the correct assignments / conditions.
  Change: 121339923
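One way to picture the split described in the commit above is a placer that filters devices itself and delegates only the final pick to an injected callable. This is a sketch under assumptions: DeviceChooser, FirstDevice, and PlaceOne are hypothetical names, and the filter is reduced to an exact match on a requested device name.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical injected strategy: picks one device from a non-empty list of
// devices that have already passed the placer's precondition filters.
using DeviceChooser =
    std::function<std::string(const std::vector<std::string>&)>;

// The default described in the commit: take the first device in the list.
inline std::string FirstDevice(const std::vector<std::string>& devices) {
  return devices.front();
}

// Hypothetical placement step. The precondition filter (here: honor a hard
// device request) stays in the placer, so a strategy never sees a device it
// is not allowed to choose. Returns an empty string if nothing is allowed.
inline std::string PlaceOne(const std::string& requested_device,
                            const std::vector<std::string>& available,
                            const DeviceChooser& choose) {
  std::vector<std::string> allowed;
  for (const std::string& d : available) {
    if (requested_device.empty() || d == requested_device) {
      allowed.push_back(d);
    }
  }
  if (allowed.empty()) return std::string();
  return choose(allowed);
}
```

Swapping FirstDevice for another callable changes the selection policy without touching the filtering, which is the separation the commit message aims for.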
* SimplePlacer: remove obsolete / never used @colocation device name (superseded | Vijay Vasudevan | 2016-05-02
  by colocation_groups in _class attr), cleanup calls to pass in name to id map, which is no longer needed in SimplePlacer.

  Should speed up graph construction in C++ because we don't need to iterate over all of the nodes once to build the map.

  In the future, utilities should rely on node ids instead of node names so the map is not necessary (ideally). Alternatively, a structure in Graph should maintain the mapping.
  Change: 121302549
* Global search & replace to move to the new location for | Josh Levenberg | 2016-01-26
  tensorflow/core/ files and build targets.
  Change: 113078283
* #include third_party/tensorflow/core/platform/macros.h | Josh Levenberg | 2016-01-06
  directly so we can drop it from port.h.
  Change: 111506630
* TensorFlow: Improve performance of Alexnet | Manjunath Kudlur | 2015-11-20
  Changes:
  * error message that refers to removed `DefaultSession` method.
  * -Wnull-conversion warnings
  * the "_start_time" attr for recvs when the flag "--brain_enable_scheduling_for_recvs" is set.
  * typo in tutorial data download progress message.
  * a typo ("however their installing"=>"however installing").
  * typo, rename "TensorFlow Mechanics" to "How To" to be consistent with the website.
  * a typo ("subtact"=>"subtract").
  * protobuf examples in comments in tensorflow::Example.proto.
  * formula formatting in MNIST beginner tutorial
  * negative fraction-of-queue-full stats
  * protobuf inclusion path so that Android demo will build under Blaze.
  * small typo (moderatly > moderately)
  * Session.run() to check that tensor arguments come from the session's graph.
  * another six import
  * seq2seq typo in bazel command
  Base CL: 108349164
* TensorFlow: Doc and linter fixes, some additional tests and | Vijay Vasudevan | 2015-11-16
  error handling, updates to website.

  Changes:
  - Removes redundant reshape from image models by @mrry
  - Default TensorBoard to localhost by @danmane
  - Reformatting of tensorflow/core by @josh11b
  - Make tutorials backwards compatible to 0.5.0 by @girving
  - Improve print documentation (md files not updated).
  - Add proper scrolling to sitemap by @martinwicke
  Base CL: 107956254
* TensorFlow: Initial commit of TensorFlow library. | Manjunath Kudlur | 2015-11-06
  TensorFlow is an open source software library for numerical computation using data flow graphs.
  Base CL: 107276108